Coding Theory

A First Course 

Coding theory is concerned with successfully transmitting data through a noisy 
channel and correcting errors in corrupted messages. It is of central importance for 
many applications in computer science or engineering. This book gives a 
comprehensive introduction to coding theory whilst only assuming basic linear 
algebra. It contains a detailed and rigorous introduction to the theory of block codes 
and moves on to more advanced topics such as BCH codes, Goppa codes and Sudan’s 
algorithm for list decoding. The issues of bounds and decoding, essential to the design 
of good codes, feature prominently. 

The authors of this book have, for several years, successfully taught a course on coding 
theory to students at the National University of Singapore. This book is based on their 
experiences and provides a thoroughly modern introduction to the subject. There is a 
wealth of examples and exercises, some of which introduce students to novel or more 
advanced material. 
Preface


In the seminal paper ‘A mathematical theory of communication’ published in 
1948, Claude Shannon showed that, given a noisy communication channel, there 
is a number, called the capacity of the channel, such that reliable communication 
can be achieved at any rate below the channel capacity, if proper encoding and 
decoding techniques are used. This marked the birth of coding theory, a field 
of study concerned with the transmission of data across noisy channels and the 
recovery of corrupted messages. 

In barely more than half a century, coding theory has seen phenomenal 
growth. It has found widespread application in areas ranging from communication
systems, to compact disc players, to storage technology. In the effort to
find good codes for practical purposes, researchers have moved beyond block 
codes to other paradigms, such as convolutional codes, turbo codes, space-time 
codes, low-density-parity-check (LDPC) codes and even quantum codes. While 
the problems in coding theory often arise from engineering applications, it is 
fascinating to note the crucial role played by mathematics in the development 
of the field. The importance of algebra, combinatorics and geometry in coding 
theory is a commonly acknowledged fact, with many deep mathematical results 
being used in elegant ways in the advancement of coding theory. 

Coding theory therefore appeals not just to engineers and computer scientists,
but also to mathematicians. It has become increasingly common to find the
subject taught as part of undergraduate or graduate curricula in mathematics. 

This book grew out of two one-semester courses we have taught at the 
National University of Singapore to advanced mathematics and computer 
science undergraduates over a number of years. Given the vastness of the 
subject, we have chosen to restrict our attention to block codes, with the aim 
of introducing the theory without a prerequisite in algebra. The only mathematical
prerequisite assumed is familiarity with basic notions and results in
linear algebra. The results on finite fields needed in the book are covered in
Chapter 3.

The design of good codes, from both the theoretical and practical points of 
view, is a very important problem in coding theory. General bounds on the 
parameters of codes are often used as benchmarks to determine how good a 
given code is, while, from the practical perspective, a code must admit an efficient
decoding scheme before it can be considered useful. Since the beginning
of coding theory, researchers have done much work in these directions and, in
the process, have constructed many interesting families of codes. This book is
built largely around these themes. A fairly detailed discussion of some
well-known bounds is included in Chapter 5, while a number of decoding
techniques are discussed throughout the book. An effort is also made to
introduce systematically many of the well-known families of codes, for example,
Hamming codes, Golay codes, Reed-Muller codes, cyclic codes, BCH codes,
Reed-Solomon codes, alternant codes and Goppa codes.

In order to stay sufficiently focused and to keep the book within a manageable 
size, we have to omit certain well established topics or examples, such as a 
thorough treatment of weight enumerators, from our discussion. Wherever 
possible, we try to include some of these omitted topics in the exercises at 
the end of each chapter. More than 250 problems have been included to help 
strengthen the reader’s understanding and to serve as an additional source of 
examples and results. 

Finally, it is a pleasure for us to acknowledge the help we have received 
while writing this book. Our research work in coding theory has received 
generous financial assistance from the Ministry of Education (Singapore), the 
National University of Singapore, the Defence Science and Technology 
Agency (Singapore) and the Chinese Academy of Sciences. We are thankful
to these organizations for their support. We thank those who have read
through the drafts carefully and provided us with invaluable feedback, especially
Fangwei Fu, Wilfried Meidl, Harald Niederreiter, Yuansheng Tang (who
has also offered us generous help in the preparation of Section 9.4), Arne
Winterhof and Sze Ling Yeo, as well as the students in the classes MA3218
and MA4261. David Chew has been most helpful in assisting us with problems
concerning LaTeX, and we are most grateful for his help. We would also like to
thank Shanthi d/o Devadas for secretarial help.



1 Introduction 


Information media, such as communication systems and storage devices of 
data, are not absolutely reliable in practice because of noise or other forms 
of introduced interference. One of the tasks in coding theory is to detect, or 
even correct, errors. Usually, coding is divided into source coding and channel
coding. Source coding involves changing the message source to a suitable
code to be transmitted through the channel. An example of source coding is 
the ASCII code, which converts each character to a byte of 8 bits. A simple 
communication model can be represented by Fig. 1.1. 

Example 1.0.1 Consider the source encoding of four fruits, apple, banana, 
cherry, grape, as follows: 

apple → 00, banana → 01, cherry → 10, grape → 11.

Suppose the message ‘apple’, which is encoded as 00, is transmitted over a 
noisy channel. The message may become distorted and may be received as 01 
(see Fig. 1.2). The receiver may not realize that the message was corrupted. 
This communication fails. 

The idea of channel coding is to encode the message again after the source 
coding by introducing some form of redundancy so that errors can be detected 
or even corrected. Thus, Fig. 1.1 becomes Fig. 1.3. 


Fig. 1.1. Simple communication model (diagram omitted; surviving labels: message source, source encoder).

Fig. 1.2. (diagram omitted)

Fig. 1.3. (diagram omitted)


Example 1.0.2 In Example 1.0.1, we perform the channel encoding by introducing
a redundancy of 1 bit as follows:

00 → 000, 01 → 011, 10 → 101, 11 → 110.

Suppose that the message ‘apple’, which is encoded as 000 after the source 
and channel encoding, is transmitted over a noisy channel, and that there is only 
one error introduced. Then the received word must be one of the following three: 
100, 010 or 001. In this way, we can detect the error, as none of 100, 010 or
001 is among our encoded messages.

Note that the above encoding scheme allows us to detect errors at the cost 
of reducing transmission speed as we have to transmit 3 bits for a message of
2 bits.
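The scheme of Example 1.0.2 can be checked mechanically. The sketch below rests on one observation that the text does not state explicitly: the added third bit equals the sum of the two message bits modulo 2 (a parity bit). The helper names `encode` and `flip` are hypothetical, chosen only for this illustration.

```python
from itertools import product


def encode(msg: str) -> str:
    """Append one parity bit so that every codeword has an even number of 1s."""
    parity = sum(int(b) for b in msg) % 2
    return msg + str(parity)


def flip(word: str, i: int) -> str:
    """Return a copy of word with the bit at index i flipped."""
    flipped = "1" if word[i] == "0" else "0"
    return word[:i] + flipped + word[i + 1:]


# All 2-bit messages and their 3-bit encodings.
messages = ["".join(bits) for bits in product("01", repeat=2)]
codewords = {encode(m) for m in messages}


def detects_all_single_errors() -> bool:
    """True if no single bit flip turns a codeword into another codeword."""
    return all(
        flip(c, i) not in codewords
        for c in codewords
        for i in range(len(c))
    )


print(sorted(codewords))
print(detects_all_single_errors())
```

A single flip always changes the parity of the word, which is why the check succeeds; it says nothing about which codeword was sent, so the error is detected but not corrected.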

The above channel encoding scheme does not allow us to correct errors. For 
instance, if 100 is received, then we do not know whether 100 comes from 000, 
110 or 101. However, if more redundancy is introduced, we are able to correct
errors. For instance, we can design the following channel coding scheme: 

00 → 00000, 01 → 01111, 10 → 10110, 11 → 11001.

Suppose that the message ‘apple’ is transmitted over a noisy channel, and that 
there is only one error introduced. Then the received word must be one of the 
following five: 10000, 01000, 00100, 00010 or 00001. Assume that 10000 is 
received. Then we can be sure that 10000 comes from 00000 because there are
at least two errors between 10000 and each of the other three encoded messages
01111, 10110 and 11001.

Fig. 1.4. (diagram omitted)

Note that we lose even more in terms of information transmission speed in 
this case. 

See Fig. 1.4 for this example. 

Example 1.0.3 Here is a simple and general method of adding redundancy for 
the purpose of error correction. Assume that source coding has already been 
done and that the information consists of bit strings of fixed length k. Encoding 
is carried out by taking a bit string and repeating it 2r + 1 times, where r ≥ 1
is a fixed integer. For instance,

01 → 0101010101

if k = 2 and r = 2. In this special case, decoding is done by first considering 
the positions 1, 3, 5, 7, 9 of the received string and taking the first decoded bit 
as the one which appears more frequently at these positions; we deal similarly 
with the positions 2, 4, 6, 8, 10 to obtain the second decoded bit. For instance, 
the received string 

1100100010 

is decoded to 10. It is clear that, in this special case, we can decode up to two 
errors correctly. In the general case, we can decode up to r errors correctly. 
Since r is arbitrary, there are thus encoders which allow us to correct as many 
errors as we want. For obvious reasons, this method is called a repetition 
code. The only problem with this method is that it involves a serious loss of 
information transmission speed. Thus, we will look for more efficient methods. 
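The repetition scheme of Example 1.0.3 can be sketched in a few lines. This is a minimal illustration, not a reference implementation; the function names `rep_encode` and `rep_decode` are hypothetical. Decoding takes a majority vote over the 2r + 1 copies of each message position.

```python
from collections import Counter


def rep_encode(msg: str, r: int) -> str:
    """Repeat the whole bit string 2r + 1 times."""
    return msg * (2 * r + 1)


def rep_decode(received: str, k: int) -> str:
    """Majority-vote decoding for messages of length k.

    Position j of the message appears at indices j, j + k, j + 2k, ...
    of the received string. With an odd number of copies of a binary
    symbol there is never a tie.
    """
    decoded = []
    for j in range(k):
        votes = Counter(received[j::k])
        decoded.append(votes.most_common(1)[0][0])
    return "".join(decoded)


print(rep_encode("01", 2))
print(rep_decode("1100100010", 2))
```

The second call reproduces the decoding carried out in the example: the odd positions hold 1, 0, 1, 0, 1 and the even positions hold 1, 0, 0, 0, 0.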

The goal of channel coding is to construct encoders and decoders in such a 
way as to effect: 


(1) fast encoding of messages;

(2) easy transmission of encoded messages;

(3) fast decoding of received messages;

(4) maximum transfer of information per unit time;

(5) maximal detection or correction capability.

From the mathematical point of view, the primary goals are (4) and (5). 
However, (5) is, in general, not compatible with (4), as we will see in Chapter 5. 
Therefore, any solution is necessarily a trade-off among the five objectives. 

Throughout this book, we are primarily concerned with channel coding. 
Channel coding is also called algebraic coding as algebraic tools are extensively 
involved in the study of channel coding. 

Exercises 

1.1 Design a channel coding scheme to detect two or fewer errors for the message
source {00, 10, 01, 11}. Can you find one of the best schemes in terms of
information transmission speed?

1.2 Design a channel coding scheme to correct two or fewer errors for the
message source {00, 10, 01, 11}. Can you find one of the best schemes
in terms of information transmission speed?

1.3 Design a channel coding scheme to detect one error for the message source

{000, 100, 010, 001, 110, 101, 011, 111}.

Can you find one of the best schemes in terms of information transmission
speed?

1.4 Design a channel coding scheme to correct one error for the message source

{000, 100, 010, 001, 110, 101, 011, 111}.

Can you find one of the best schemes in terms of information transmission
speed?
2 Error detection, correction and decoding


We saw in Chapter 1 that the purpose of channel coding is to introduce redundancy
to information messages so that errors that occur in the transmission can
be detected or even corrected. In this chapter, we formalize and discuss the
notions of error detection and error correction. We also introduce some well-known
decoding rules, i.e., methods that retrieve the original message sent by
detecting and correcting the errors that have occurred in the transmission.


2.1 Communication channels 

We begin with some basic definitions. 


Definition 2.1.1 Let $A = \{a_1, a_2, \ldots, a_q\}$ be a set of size q, which we refer to
as a code alphabet and whose elements are called code symbols.

(i) A q-ary word of length n over A is a sequence $w = w_1 w_2 \cdots w_n$ with
each $w_i \in A$ for all i. Equivalently, w may also be regarded as the vector
$(w_1, \ldots, w_n)$.

(ii) A q-ary block code of length n over A is a nonempty set C of q-ary words
having the same length n.

(iii) An element of C is called a codeword in C.

(iv) The number of codewords in C, denoted by |C|, is called the size of C.

(v) The (information) rate of a code C of length n is defined to be $(\log_q |C|)/n$.

(vi) A code of length n and size M is called an (n, M)-code.


Remark 2.1.2 In practice, and especially in this book, the code alphabet is
often taken to be a finite field $\mathbf{F}_q$ of order q (cf. Chapter 3).

Fig. 2.1. (diagram omitted)

Example 2.1.3 A code over the code alphabet $\mathbf{F}_2 = \{0, 1\}$ is called a binary
code; i.e., the code symbols for a binary code are 0 and 1. Some examples of
binary codes are:

(i) $C_1 = \{00, 01, 10, 11\}$ is a (2,4)-code;

(ii) $C_2 = \{000, 011, 101, 110\}$ is a (3,4)-code;

(iii) $C_3 = \{0011, 0101, 1010, 1100, 1001, 0110\}$ is a (4,6)-code.

A code over the code alphabet $\mathbf{F}_3 = \{0, 1, 2\}$ is called a ternary code, while
the term quaternary code is sometimes used for a code over the code alphabet
$\mathbf{F}_4$. However, a code over the code alphabet $\mathbf{Z}_4 = \{0, 1, 2, 3\}$ is also sometimes
referred to as a quaternary code (cf. Chapter 3 for the definitions of $\mathbf{F}_3$, $\mathbf{F}_4$
and $\mathbf{Z}_4$).
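The parameters in Definition 2.1.1 are easy to compute for the three binary codes of Example 2.1.3. The following sketch assumes codes are represented as Python sets of strings; the function names `parameters` and `rate` are hypothetical.

```python
import math


def parameters(code):
    """Return (n, M): the common word length and the size of the code."""
    lengths = {len(word) for word in code}
    if len(lengths) != 1:
        raise ValueError("all codewords of a block code must have the same length")
    return lengths.pop(), len(code)


def rate(code, q):
    """Information rate (log_q |C|) / n of a q-ary block code."""
    n, size = parameters(code)
    return math.log(size, q) / n


C1 = {"00", "01", "10", "11"}
C2 = {"000", "011", "101", "110"}
C3 = {"0011", "0101", "1010", "1100", "1001", "0110"}

for name, code in [("C1", C1), ("C2", C2), ("C3", C3)]:
    print(name, parameters(code), round(rate(code, 2), 4))
```

Here $C_1$ has rate 1 (no redundancy), $C_2$ has rate 2/3, and $C_3$ has rate $(\log_2 6)/4$.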

Definition 2.1.4 A communication channel consists of a finite channel alphabet
$A = \{a_1, \ldots, a_q\}$ as well as a set of forward channel probabilities
$P(a_j \text{ received} \mid a_i \text{ sent})$, satisfying

$$\sum_{j=1}^{q} P(a_j \text{ received} \mid a_i \text{ sent}) = 1$$

for all i (see Fig. 2.1). (Here, $P(a_j \text{ received} \mid a_i \text{ sent})$ is the conditional
probability that $a_j$ is received, given that $a_i$ is sent.)

Definition 2.1.5 A communication channel is said to be memoryless if the
outcome of any one transmission is independent of the outcome of the previous
transmissions; i.e., if $c = c_1 c_2 \cdots c_n$ and $x = x_1 x_2 \cdots x_n$ are words of length n,
then

$$P(x \text{ received} \mid c \text{ sent}) = \prod_{i=1}^{n} P(x_i \text{ received} \mid c_i \text{ sent}).$$

Fig. 2.2. Binary symmetric channel.

Definition 2.1.6 A q-ary symmetric channel is a memoryless channel which
has a channel alphabet of size q such that

(i) each symbol transmitted has the same probability p (< 1/2) of being
received in error;

(ii) if a symbol is received in error, then each of the q - 1 possible errors is
equally likely.

In particular, the binary symmetric channel (BSC) is a memoryless channel
which has channel alphabet {0, 1} and channel probabilities

$$P(1 \text{ received} \mid 0 \text{ sent}) = P(0 \text{ received} \mid 1 \text{ sent}) = p,$$

$$P(0 \text{ received} \mid 0 \text{ sent}) = P(1 \text{ received} \mid 1 \text{ sent}) = 1 - p.$$

Thus, the probability of a bit error in a BSC is p. This is called the crossover
probability of the BSC (see Fig. 2.2).

Example 2.1.7 Suppose that codewords from the code {000, 111} are being
sent over a BSC with crossover probability p = 0.05. Suppose that the word
110 is received. We can try to find the more likely codeword sent by computing
the forward channel probabilities:

$$P(110 \text{ received} \mid 000 \text{ sent}) = P(1 \text{ received} \mid 0 \text{ sent})^2 \times P(0 \text{ received} \mid 0 \text{ sent}) = (0.05)^2 (0.95) = 0.002375,$$

$$P(110 \text{ received} \mid 111 \text{ sent}) = P(1 \text{ received} \mid 1 \text{ sent})^2 \times P(0 \text{ received} \mid 1 \text{ sent}) = (0.95)^2 (0.05) = 0.045125.$$

Since the second probability is larger than the first, we can conclude that 111
is more likely to be the codeword sent.
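The computation in Example 2.1.7 follows directly from the memoryless property: the probability of a received word is the product of the per-symbol probabilities. A minimal sketch, with hypothetical function names, assuming words are given as strings over {0, 1}:

```python
import math


def bsc_symbol_probability(received: str, sent: str, p: float) -> float:
    """P(received | sent) for one symbol on a BSC with crossover probability p."""
    return 1 - p if received == sent else p


def forward_probability(x: str, c: str, p: float) -> float:
    """P(x received | c sent) on a memoryless BSC: product over all positions."""
    if len(x) != len(c):
        raise ValueError("words must have the same length")
    return math.prod(bsc_symbol_probability(xi, ci, p) for xi, ci in zip(x, c))


p = 0.05
for codeword in ("000", "111"):
    print(codeword, forward_probability("110", codeword, p))
```

Up to floating-point rounding, the two printed values are the probabilities 0.002375 and 0.045125 obtained in the example.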
Decoding rule

In a communication channel with coding, only codewords are transmitted. 
Suppose that a word w is received. If w is a valid codeword, we may conclude 
that there is no error in the transmission. Otherwise, we know that some errors 
have occurred. In this case, we need a rule for finding the most likely codeword 
sent. Such a rule is known as a decoding rule. We discuss two such general 
rules in this chapter. Some other decoding rules, which may apply to certain 
specific families of codes, will also be introduced in subsequent chapters. 

2.2 Maximum likelihood decoding 

Suppose that codewords from a code C are being sent over a communication
channel. If a word x is received, we can compute the forward channel
probabilities

$$P(x \text{ received} \mid c \text{ sent})$$

for all the codewords c ∈ C. The maximum likelihood decoding (MLD) rule
will conclude that $c_x$ is the most likely codeword transmitted if $c_x$ maximizes
the forward channel probabilities; i.e.,

$$P(x \text{ received} \mid c_x \text{ sent}) = \max_{c \in C} P(x \text{ received} \mid c \text{ sent}).$$

There are two kinds of MLD:

(i) Complete maximum likelihood decoding (CMLD). If a word x is received,
find the most likely codeword transmitted. If there is more than one such
codeword, select one of them arbitrarily.

(ii) Incomplete maximum likelihood decoding (IMLD). If a word x is received,
find the most likely codeword transmitted. If there is more than one such
codeword, request a retransmission.
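The two variants of MLD can be sketched as follows for any memoryless channel. The representation is an assumption made for this illustration: the channel is a dictionary mapping pairs (sent symbol, received symbol) to forward probabilities, `None` stands for a retransmission request, and the "arbitrary" choice in complete decoding is taken to be the first tied codeword in sorted order. The function names are hypothetical.

```python
import math


def forward_probability(x, c, channel):
    """P(x received | c sent) for a memoryless channel.

    channel[(a, b)] is the probability that b is received when a is sent.
    """
    return math.prod(channel[(ci, xi)] for xi, ci in zip(x, c))


def mld(x, code, channel, complete=False):
    """Maximum likelihood decoding of the received word x.

    Incomplete decoding returns None on a tie (retransmission requested);
    complete decoding returns the first tied codeword in sorted order.
    """
    probs = {c: forward_probability(x, c, channel) for c in code}
    best = max(probs.values())
    winners = sorted(c for c, pr in probs.items() if math.isclose(pr, best))
    if len(winners) == 1 or complete:
        return winners[0]
    return None


# Binary symmetric channel with crossover probability 0.05.
p = 0.05
bsc = {("0", "0"): 1 - p, ("0", "1"): p, ("1", "0"): p, ("1", "1"): 1 - p}

print(mld("110", ["000", "111"], bsc))
print(mld("01", ["00", "11"], bsc))
print(mld("01", ["00", "11"], bsc, complete=True))
```

The first call repeats Example 2.1.7; the other two show a tie, where the two rules differ.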

2.3 Hamming distance 

Suppose that codewords from a code C are being sent over a BSC with crossover
probability p < 1/2 (in practice, p should be much smaller than 1/2). If a word
x is received, then for any codeword c ∈ C the forward channel probability is
given by

$$P(x \text{ received} \mid c \text{ sent}) = p^e (1 - p)^{n-e},$$

where n is the length of x and e is the number of places at which x and c differ.
Since p < 1/2, it follows that 1 - p > p, so this probability is larger for
larger values of n - e, i.e., for smaller values of e. Hence, this probability is
maximized by choosing a codeword c for which e is as small as possible. This
value e leads us to introduce the following fundamental notion of Hamming
distance.


Definition 2.3.1 Let x and y be words of length n over an alphabet A. The
(Hamming) distance from x to y, denoted by d(x, y), is defined to be the number
of places at which x and y differ. If $x = x_1 \cdots x_n$ and $y = y_1 \cdots y_n$, then

$$d(x, y) = d(x_1, y_1) + \cdots + d(x_n, y_n), \qquad (2.1)$$

where $x_i$ and $y_i$ are regarded as words of length 1, and

$$d(x_i, y_i) = \begin{cases} 1 & \text{if } x_i \ne y_i, \\ 0 & \text{if } x_i = y_i. \end{cases}$$


Example 2.3.2 (i) Let A = {0, 1} and let x = 01010, y = 01101, z = 11101.
Then

d(x, y) = 3,
d(y, z) = 1,
d(z, x) = 4.

(ii) Let A = {0, 1, 2, 3, 4} and let x = 1234, y = 1423, z = 3214. Then

d(x, y) = 3,
d(y, z) = 4,
d(z, x) = 2.
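Equation (2.1) translates directly into code: the distance is a sum of position-wise contributions, each 0 or 1. A small sketch, assuming words are strings over the alphabet (the function name is hypothetical), which reproduces the values of Example 2.3.2:

```python
def hamming_distance(x: str, y: str) -> int:
    """Number of positions at which the equal-length words x and y differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance is defined only for words of equal length")
    # Each position contributes 1 if the symbols differ and 0 otherwise.
    return sum(xi != yi for xi, yi in zip(x, y))


# Part (i): binary words.
x, y, z = "01010", "01101", "11101"
print(hamming_distance(x, y), hamming_distance(y, z), hamming_distance(z, x))

# Part (ii): words over {0, 1, 2, 3, 4}.
x, y, z = "1234", "1423", "3214"
print(hamming_distance(x, y), hamming_distance(y, z), hamming_distance(z, x))
```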


Proposition 2.3.3 Let x, y, z be words of length n over A. Then we have

(i) 0 ≤ d(x, y) ≤ n,

(ii) d(x, y) = 0 if and only if x = y,

(iii) d(x, y) = d(y, x),

(iv) (Triangle inequality.) d(x, z) ≤ d(x, y) + d(y, z).

Proof. (i), (ii) and (iii) are obvious from the definition of the Hamming distance.
By (2.1), it is enough to prove (iv) when n = 1, which we now assume.

If x = z, then (iv) is obviously true since d(x, z) = 0.

If x ≠ z, then either y ≠ x or y ≠ z, so d(x, y) + d(y, z) ≥ 1 = d(x, z) and
(iv) is again true. □
2.4 Nearest neighbour/minimum distance decoding

Suppose that codewords from a code C are being sent over a communication
channel. If a word x is received, the nearest neighbour decoding rule (or
minimum distance decoding rule) will decode x to $c_x$ if $d(x, c_x)$ is minimal
among all the codewords in C, i.e.,

$$d(x, c_x) = \min_{c \in C} d(x, c). \qquad (2.2)$$

Just as for the case of maximum likelihood decoding, we can distinguish
between complete and incomplete decoding for the nearest neighbour decoding
rule. For a given received word x, if two or more codewords $c_x$ satisfy (2.2), then
the complete decoding rule arbitrarily selects one of them to be the most likely
word sent, while the incomplete decoding rule requests a retransmission.

Theorem 2.4.1 For a BSC with crossover probability p < 1/2, the maximum
likelihood decoding rule is the same as the nearest neighbour decoding rule.

Proof. Let C denote the code in use and let x denote the received word (of
length n). For any vector c of length n, and for any 0 ≤ i ≤ n,

$$d(x, c) = i \iff P(x \text{ received} \mid c \text{ sent}) = p^i (1 - p)^{n-i}.$$

Since p < 1/2, it follows that

$$p^0 (1 - p)^n > p^1 (1 - p)^{n-1} > p^2 (1 - p)^{n-2} > \cdots > p^n (1 - p)^0.$$

By definition, the maximum likelihood decoding rule decodes x to c ∈ C such
that $P(x \text{ received} \mid c \text{ sent})$ is the largest, i.e., such that d(x, c) is the smallest
(or seeks retransmission if incomplete decoding is in use and c is not unique).
Hence, it is the same as the nearest neighbour decoding rule. □

Remark 2.4.2 From now on, we will assume that all BSCs have crossover 
probabilities p < 1/2. Consequently, we can use the minimum distance 
decoding rule to perform MLD. 

Example 2.4.3 Suppose codewords from the binary code

C = {0000, 0011, 1000, 1100, 0001, 1001}

are being sent over a BSC. Assuming x = 0111 is received, then

d(0111, 0000) = 3,
d(0111, 0011) = 1,
d(0111, 1000) = 4,
d(0111, 1100) = 3,
d(0111, 0001) = 2,
d(0111, 1001) = 3.

By using nearest neighbour decoding, we decode x to 0011.

Example 2.4.4 Let C = {000, 011} be a binary code. The IMLD table for C
is as shown in Table 2.1, where '-' means that retransmission is sought.

Table 2.1. IMLD table for C.

Received x    d(x, 000)    d(x, 011)    Decode to
000           0            2            000
100           1            3            000
010           1            1            -
001           1            1            -
110           2            2            -
101           2            2            -
011           2            0            011
111           3            1            011
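Examples 2.4.3 and 2.4.4 can be reproduced with a short sketch of nearest neighbour decoding. As assumptions of this illustration, `None` plays the role of the '-' entry (retransmission sought), complete decoding resolves ties by taking the first tied codeword in sorted order, and the function names are hypothetical.

```python
from itertools import product


def hamming_distance(x: str, y: str) -> int:
    """Number of positions at which x and y differ."""
    return sum(xi != yi for xi, yi in zip(x, y))


def nearest_neighbour(x, code, complete=False):
    """Decode x to the codeword at minimum Hamming distance.

    On a tie, incomplete decoding returns None and complete decoding
    returns the first tied codeword in sorted order.
    """
    dists = {c: hamming_distance(x, c) for c in code}
    best = min(dists.values())
    winners = sorted(c for c, d in dists.items() if d == best)
    if len(winners) == 1 or complete:
        return winners[0]
    return None


def imld_table(code, alphabet="01"):
    """Incomplete decoding result for every possible received word."""
    n = len(next(iter(code)))
    words = ("".join(symbols) for symbols in product(alphabet, repeat=n))
    return {w: nearest_neighbour(w, code) for w in words}


print(nearest_neighbour("0111", ["0000", "0011", "1000", "1100", "0001", "1001"]))
for received, decoded in imld_table(["000", "011"]).items():
    print(received, decoded if decoded is not None else "-")
```

By Theorem 2.4.1, on a BSC with p < 1/2 this table is also the incomplete maximum likelihood decoding table.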

2.5 Distance of a code 

Apart from the length and size of a code, another important and useful
characteristic of a code is its distance.

Definition 2.5.1 For a code C containing at least two words, the (minimum)
distance of C, denoted by d(C), is

$$d(C) = \min\{d(x, y) : x, y \in C,\ x \ne y\}.$$

Definition 2.5.2 A code of length n, size M and distance d is referred to as
an (n, M, d)-code. The numbers n, M and d are called the parameters of the
code.

Example 2.5.3 (i) Let C = {00000, 00111, 11111} be a binary code. Then
d(C) = 2 since

d(00000, 00111) = 3,
d(00000, 11111) = 5,
d(00111, 11111) = 2.

Hence, C is a binary (5,3,2)-code.

(ii) Let C = {000000, 000111, 111222} be a ternary code (i.e. with code
alphabet {0, 1, 2}). Then d(C) = 3 since

d(000000, 000111) = 3,
d(000000, 111222) = 6,
d(000111, 111222) = 6.

Hence, C is a ternary (6,3,3)-code.
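For small codes, the distance in Definition 2.5.1 can be found by comparing every pair of distinct codewords, exactly as in Example 2.5.3. A brute-force sketch with hypothetical function names (its cost grows quadratically with the size of the code, so it is meant only for illustration):

```python
from itertools import combinations


def hamming_distance(x: str, y: str) -> int:
    """Number of positions at which x and y differ."""
    return sum(xi != yi for xi, yi in zip(x, y))


def minimum_distance(code) -> int:
    """d(C): the smallest distance between two distinct codewords."""
    words = list(code)
    if len(words) < 2:
        raise ValueError("the distance is defined only for codes with at least two words")
    return min(hamming_distance(x, y) for x, y in combinations(words, 2))


def code_parameters(code):
    """Return the parameters (n, M, d) of the code."""
    words = list(code)
    return len(words[0]), len(words), minimum_distance(words)


print(code_parameters(["00000", "00111", "11111"]))
print(code_parameters(["000000", "000111", "111222"]))
```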

It turns out that the distance of a code is intimately related to the error- 
detecting and error-correcting capabilities of the code. 

Definition 2.5.4 Let u be a positive integer. A code C is u-error-detecting if,
whenever a codeword incurs at least one but at most u errors, the resulting word
is not a codeword. A code C is exactly u-error-detecting if it is u-error-detecting
but not (u + 1)-error-detecting.

Example 2.5.5 (i) The binary code C = {00000, 00111, 11111} is 1-error-detecting
since changing any codeword in one position does not result in another
codeword. In other words,

00000 → 00111 needs to change three bits,

00000 → 11111 needs to change five bits,

00111 → 11111 needs to change two bits.

In fact, C is exactly 1-error-detecting, as changing the first two positions of
00111 will result in another codeword 11111 (so C is not a 2-error-detecting
code).

(ii) The ternary code C = {000000, 000111, 111222} is 2-error-detecting
since changing any codeword in one or two positions does not result in another
codeword. In other words,

000000 → 000111 needs to change three positions,

000000 → 111222 needs to change six positions,

000111 → 111222 needs to change six positions.

In fact, C is exactly 2-error-detecting, as changing each of the last three positions
of 000000 to 1 will result in the codeword 000111 (so C is not 3-error-detecting).

Theorem 2.5.6 A code C is u-error-detecting if and only if d(C) ≥ u + 1; i.e.,
a code with distance d is an exactly (d - 1)-error-detecting code.

Proof. Suppose d(C) ≥ u + 1. If c ∈ C and x are such that 1 ≤ d(c, x) ≤ u <
d(C), then x ∉ C; hence, C is u-error-detecting.

On the other hand, if d(C) < u + 1, i.e., d(C) ≤ u, then there exist $c_1, c_2 \in C$
such that $1 \le d(c_1, c_2) = d(C) \le u$. It is therefore possible that we begin
with $c_1$ and d(C) errors (where 1 ≤ d(C) ≤ u) are incurred such that the
resulting word is $c_2$, another codeword in C. Hence, C is not a u-error-detecting
code. □

Remark 2.5.7 An illustration of Theorem 2.5.6 is given by comparing 
Examples 2.5.5 and 2.5.3. 

Definition 2.5.8 Let v be a positive integer. A code C is v-error-correcting if
minimum distance decoding is able to correct v or fewer errors, assuming that
the incomplete decoding rule is used. A code C is exactly v-error-correcting
if it is v-error-correcting but not (v + 1)-error-correcting.

Example 2.5.9 Consider the binary code C = {000, 111}. By using the minimum
distance decoding rule, we see that:

• if 000 is sent and one error occurs in the transmission, then the received word
(100, 010 or 001) will be decoded to 000;

• if 111 is sent and one error occurs in the transmission, then the received word
(110, 101 or 011) will be decoded to 111.

In all cases, the single error has been corrected. Hence, C is 1-error-correcting.

If at least two errors occur, the decoding rule may produce the wrong codeword.
For instance, if 000 is sent and 011 is received, then 011 will be decoded
to 111 using the minimum distance decoding rule. Hence, C is exactly
1-error-correcting.

Theorem 2.5.10 A code C is v-error-correcting if and only if d(C) ≥ 2v + 1;
i.e., a code with distance d is an exactly $\lfloor (d - 1)/2 \rfloor$-error-correcting code.
Here, $\lfloor x \rfloor$ is the greatest integer less than or equal to x.

Proof. (⇐) Suppose that d(C) ≥ 2v + 1. Let c be the codeword sent and let
x be the word received. If v or fewer errors occur in the transmission, then
d(x, c) ≤ v. Hence, for any codeword c' ∈ C, c ≠ c', we have

$$d(x, c') \ge d(c, c') - d(x, c) \ge 2v + 1 - v = v + 1 > d(x, c).$$

Thus, x will be decoded (correctly) to c if the minimum distance decoding rule
is used. This shows that C is v-error-correcting.
Fig. 2.3. (diagram omitted)

(⇒) Suppose that C is v-error-correcting. If d(C) < 2v + 1, then there are
distinct codewords c, c' ∈ C with d(c, c') = d(C) ≤ 2v. We shall show that,
assuming c is sent and at most v errors occur, it can occur that minimum distance
decoding will either decode the received word incorrectly as c' or report a tie
(and hence these errors cannot be corrected if the incomplete decoding rule is
used). This will contradict the assumption that C is v-error-correcting, hence
showing that d(C) ≥ 2v + 1.

Notice that, if d(c, c') < v + 1, then c could be changed into c' by incurring
at most v errors, and these errors would go uncorrected (in fact, undetected!)
since c' is again in C. This, however, would contradict the assumption that C is
v-error-correcting. Therefore, d(c, c') ≥ v + 1. Without loss of generality, we
may hence assume that c and c' differ in exactly the first d = d(C) positions,
where v + 1 ≤ d ≤ 2v. If the word

$$x = \underbrace{x_1 \cdots x_v}_{\text{agree with } c'} \; \underbrace{x_{v+1} \cdots x_d}_{\text{agree with } c} \; \underbrace{x_{d+1} \cdots x_n}_{\text{agree with both}}$$

is received, then we have

$$d(x, c') = d - v \le v = d(x, c).$$

It follows that either d(x, c') < d(x, c), in which case x is decoded incorrectly
as c', or d(x, c) = d(x, c'), in which case a tie is reported. □
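Theorem 2.5.10 can be checked by brute force on the small codes used in this chapter. The sketch below tests Definition 2.5.8 directly: for every codeword c and every word x within distance v of c, the codeword c must be strictly closer to x than any other codeword, so that incomplete minimum distance decoding returns c. The function names and the string representation of words are assumptions of this illustration, and the exhaustive search is feasible only for very short codes.

```python
from itertools import combinations, product


def hamming_distance(x: str, y: str) -> int:
    """Number of positions at which x and y differ."""
    return sum(xi != yi for xi, yi in zip(x, y))


def minimum_distance(code) -> int:
    """d(C): the smallest distance between two distinct codewords."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))


def is_v_error_correcting(code, v: int, alphabet: str = "01") -> bool:
    """Check Definition 2.5.8 by exhaustive search over all received words."""
    n = len(code[0])
    for c in code:
        for symbols in product(alphabet, repeat=n):
            x = "".join(symbols)
            if hamming_distance(x, c) > v:
                continue  # more than v errors: outside the definition
            others = [hamming_distance(x, w) for w in code if w != c]
            if others and min(others) <= hamming_distance(x, c):
                return False  # wrong decoding or a tie
    return True


for code, alphabet in [
    (["000", "111"], "01"),
    (["000000", "000111", "111222"], "012"),
]:
    d = minimum_distance(code)
    v = (d - 1) // 2
    print(code, d, v, is_v_error_correcting(code, v, alphabet),
          is_v_error_correcting(code, v + 1, alphabet))
```

For both codes the search agrees with the theorem: the code corrects $\lfloor (d - 1)/2 \rfloor$ errors but not one more.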


Exercises 

2.1 Explain why the binary communication channel shown in Fig. 2.3, where
p < 0.5, is called a useless channel.

2.2 Suppose that codewords from the binary code {000, 100, 111} are being
sent over a BSC (binary symmetric channel) with crossover probability
p = 0.03. Use the maximum likelihood decoding rule to decode the
following received words:

(a) 010, (b) 011, (c) 001.

2.3 Consider a memoryless binary channel with channel probabilities

P(0 received | 0 sent) = 0.7, P(1 received | 1 sent) = 0.8.

If codewords from the binary code {000, 100, 111} are being sent over
this channel, use the maximum likelihood decoding rule to decode the
following received words:

(a) 010, (b) 011, (c) 001.

2.4 Let C = {001, 011} be a binary code.

(a) Suppose we have a memoryless binary channel with the following
probabilities:

P(0 received | 0 sent) = 0.1 and P(1 received | 1 sent) = 0.5.

Use the maximum likelihood decoding rule to decode the received
word 000.

(b) Use the nearest neighbour decoding rule to decode 000.

2.5 For the binary code C = {01101, 00011, 10110, 11000}, use the nearest
neighbour decoding rule to decode the following received words:

(a) 00000, (b) 01111, (c) 10110, (d) 10011, (e) 11011.

2.6 For the ternary code C = {00122, 12201, 20110, 22000}, use the nearest
neighbour decoding rule to decode the following received words:

(a) 01122, (b) 10021, (c) 22022, (d) 20120.

2.7 Construct the IMLD (incomplete maximum likelihood decoding) table for
each of the following binary codes:

(a) C = {101, 111, 011},

(b) C = {000, 001, 010, 011}.

2.8 Determine the number of binary codes with parameters (n, 2, n) for n ≥ 2.




3 Finite fields 


From the previous chapter, we know that a code alphabet A is a finite set. In 
order to play mathematical games, we are going to equip A with some algebraic 
structures. As we know, a field, such as the real field R or the complex field C, 
has two operations, namely addition and multiplication. Our idea is to define 
two operations for A so that A becomes a field. Of course, then A is a field 
with only finitely many elements, whilst R and C are fields with infinitely many 
elements. Fields with finitely many elements are quite different from those that 
we have learnt about before.

The theory of finite fields goes back to the seventeenth and eighteenth centuries,
with eminent mathematicians such as Pierre de Fermat (1601-1665)
and Leonhard Euler (1707-1783) contributing to the structure theory of special
finite fields. The general theory of finite fields began with the work of Carl
Friedrich Gauss (1777-1855) and Évariste Galois (1811-1832), but it only became
of interest for applied mathematicians and engineers in recent decades
because of its many applications to mathematics, computer science and
communication theory. Nowadays, the theory of finite fields has become very rich.
In this chapter, we only study a small portion of this theory. The reader already 
familiar with the elementary properties of finite fields may wish to proceed 
directly to the next chapter. For a more complete introduction to finite fields, 
the reader is invited to consult ref. [11]. 


3.1 Fields 

Definition 3.1.1 A field is a nonempty set F of elements with two operations
'+' (called addition) and '·' (called multiplication) satisfying the following
axioms. For all a, b, c ∈ F:
 + | 0  1         · | 0  1
---+------       ---+------
 0 | 0  1         0 | 0  0
 1 | 1  0         1 | 0  1

Fig. 3.1. Addition and multiplication tables for $\mathbf{Z}_2$.

(i) F is closed under + and ·; i.e., a + b and a · b are in F.

(ii) Commutative laws: a + b = b + a, a · b = b · a.

(iii) Associative laws: (a + b) + c = a + (b + c), a · (b · c) = (a · b) · c.

(iv) Distributive law: a · (b + c) = a · b + a · c.

Furthermore, two distinct identity elements 0 and 1 (called the additive and
multiplicative identities, respectively) must exist in F satisfying the following:

(v) a + 0 = a for all a ∈ F.

(vi) a · 1 = a and a · 0 = 0 for all a ∈ F.

(vii) For any a in F, there exists an additive inverse element (-a) in F such
that a + (-a) = 0.

(viii) For any a ≠ 0 in F, there exists a multiplicative inverse element $a^{-1}$ in
F such that $a \cdot a^{-1} = 1$.

We usually write a · b simply as ab, and denote by $F^*$ the set $F \setminus \{0\}$.

Example 3.1.2 (i) Some familiar fields are the rational field

$$\mathbb{Q} := \left\{ \frac{a}{b} : a, b \text{ are integers with } b \ne 0 \right\},$$

the real field $\mathbb{R}$ and the complex field $\mathbb{C}$. It is easy to check that all the axioms
in Definition 3.1.1 are satisfied for the above three fields. In fact, we are not
interested in these fields because all of them have an infinite number of elements.

(ii) Denote by $\mathbb{Z}_2$ the set $\{0, 1\}$. We define the addition and multiplication
as in Fig. 3.1.

Then, it is easy to check that $\mathbb{Z}_2$ is a field. It has only two elements!

More properties of a field can be deduced from the definition. 

Lemma 3.1.3 Let $a, b$ be any two elements of a field $F$. Then

(i) $(-1) \cdot a = -a$;

(ii) $ab = 0$ implies $a = 0$ or $b = 0$.

Proof. (i) We have

$$(-1) \cdot a + a \overset{\text{(vi)}}{=} (-1) \cdot a + a \cdot 1 \overset{\text{(ii),(iv)}}{=} ((-1) + 1) \cdot a \overset{\text{(vii)}}{=} 0 \cdot a \overset{\text{(ii),(vi)}}{=} 0,$$

where the Roman numerals in the above formula stand for the axioms in
Definition 3.1.1. Thus, $(-1) \cdot a = -a$.

(ii) If $a \ne 0$, then

$$0 \overset{\text{(vi)}}{=} a^{-1} \cdot 0 = a^{-1}(ab) \overset{\text{(iii)}}{=} (a^{-1}a)b \overset{\text{(ii),(viii)}}{=} 1 \cdot b \overset{\text{(ii)}}{=} b \cdot 1 \overset{\text{(vi)}}{=} b,$$

where the Roman numerals in the above formula again stand for the axioms in
Definition 3.1.1. □

A field containing only finitely many elements is called a finite field. A set
$F$ satisfying axioms (i)-(vii) in Definition 3.1.1 is called a (commutative) ring.


Example 3.1.4 (i) The set of all integers

$$\mathbb{Z} := \{0, \pm 1, \pm 2, \ldots\}$$

forms a ring under the normal addition and multiplication. It is called the
integer ring.

(ii) The set of all polynomials over a field $F$,

$$F[x] := \{a_0 + a_1 x + \cdots + a_n x^n : a_i \in F,\ n \ge 0\},$$

forms a ring under the normal addition and multiplication of polynomials.


Definition 3.1.5 Let $a$, $b$ and $m > 1$ be integers. We say that $a$ is congruent to
$b$ modulo $m$, written as

$$a \equiv b \pmod{m},$$

if $m \mid (a - b)$; i.e., $m$ divides $a - b$.

Example 3.1.6

(i) $90 \equiv 30 \pmod{60}$ and $15 \equiv 3 \pmod{12}$.

(ii) $a \equiv 0 \pmod{m}$ means that $m \mid a$.

(iii) $a \equiv 0 \pmod{2}$ means that $a$ is even.

(iv) $a \equiv 1 \pmod{2}$ means that $a$ is odd.


Remark 3.1.7 Given integers $a$ and $m > 1$, by the division algorithm we have

$$a = mq + b, \qquad (3.1)$$

where $b$ is uniquely determined by $a$ and $m$, and $0 \le b \le m - 1$. Hence, any
integer $a$ is congruent to exactly one of $0, 1, \ldots, m - 1$ modulo $m$. The integer
$b$ in (3.1) is called the (principal) remainder of $a$ divided by $m$, denoted by
$(a \pmod{m})$.

Fig. 3.2. Addition and multiplication tables for $\mathbb{Z}_4$.

If $a \equiv b \pmod{m}$ and $c \equiv d \pmod{m}$, then we have

$$a + c \equiv b + d \pmod{m},$$
$$a - c \equiv b - d \pmod{m},$$
$$a \times c \equiv b \times d \pmod{m}.$$

For an integer $m > 1$, we denote by $\mathbb{Z}_m$ or $\mathbb{Z}/(m)$ the set $\{0, 1, \ldots, m - 1\}$
and define the addition $\oplus$ and multiplication $\odot$ in $\mathbb{Z}_m$ by:

$a \oplus b$ = the remainder of $a + b$ divided by $m$, i.e., $(a + b \pmod{m})$,

and

$a \odot b$ = the remainder of $ab$ divided by $m$, i.e., $(ab \pmod{m})$.

It is easy to show that all the axioms (i)-(vii) in Definition 3.1.1 are satisfied.
Hence, $\mathbb{Z}_m$, together with the addition $\oplus$ and multiplication $\odot$ defined above,
forms a ring.

We will continue to denote '$\oplus$' and '$\odot$' in $\mathbb{Z}_m$ by '$+$' and '$\cdot$', respectively.

Example 3.1.8 (i) Modulo 2: the field $\mathbb{Z}_2$ in Example 3.1.2(ii) is exactly the
ring defined above for $m = 2$. In this case, axiom (viii) is also satisfied. Thus,
it is a field.

(ii) Modulo 4: we construct the addition and multiplication tables for $\mathbb{Z}_4$
(Fig. 3.2). From the multiplication table in Fig. 3.2, we can see that $\mathbb{Z}_4$ is not
a field since $2^{-1}$ does not exist.
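The tables for $\mathbb{Z}_m$ are easy to generate by machine. The following is a minimal Python sketch, not part of the original text; the function names `tables`, `units` and `is_field` are illustrative choices, and the invertibility test is a plain brute-force search that assumes $m > 1$.

```python
def tables(m):
    """Return the addition and multiplication tables of Z_m as lists of rows."""
    add = [[(a + b) % m for b in range(m)] for a in range(m)]
    mul = [[(a * b) % m for b in range(m)] for a in range(m)]
    return add, mul


def units(m):
    """Nonzero elements of Z_m that have a multiplicative inverse (brute force)."""
    return [a for a in range(1, m) if any((a * b) % m == 1 for b in range(m))]


def is_field(m):
    """Z_m satisfies axiom (viii) exactly when every nonzero element is a unit."""
    return len(units(m)) == m - 1


add4, mul4 = tables(4)
for row in mul4:
    print(row)                 # the row of 2 never contains a 1
print(units(4), is_field(4))
```

The row of the element 2 in the multiplication table of $\mathbb{Z}_4$ contains no 1, which is the observation made in Example 3.1.8(ii).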

We find from the above example that $\mathbb{Z}_m$ is a field for some integers $m$ and
is just a ring for other integers. In fact, we have the following pleasing result.

Theorem 3.1.9 $\mathbb{Z}_m$ is a field if and only if $m$ is a prime.

Proof. Suppose that $m$ is a composite number and let $m = ab$ for two integers
$1 < a, b < m$. Thus, $a \ne 0$, $b \ne 0$. However, $0 = m = a \cdot b$ in $\mathbb{Z}_m$. This is a
contradiction to Lemma 3.1.3(ii). Hence, $\mathbb{Z}_m$ is not a field.

Now let $m$ be a prime. For any nonzero element $a \in \mathbb{Z}_m$, i.e., $0 < a < m$, we
know that $a$ is prime to $m$. Thus, there exist two integers $u, v$ with $0 \le u \le m - 1$
such that $ua + vm = 1$, i.e., $ua \equiv 1 \pmod{m}$. Hence, $u = a^{-1}$. This
implies that axiom (viii) in Definition 3.1.1 is also satisfied and hence $\mathbb{Z}_m$ is a
field. □
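The integers $u, v$ with $ua + vm = 1$ used in the proof can be computed with the extended Euclidean algorithm. The sketch below is one possible Python version, offered as an illustration rather than as the authors' method; `extended_gcd` and `inverse_mod` are hypothetical helper names.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) and u*a + v*b == g."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


def inverse_mod(a, m):
    """Inverse of a in Z_m, reduced to the range 0..m-1; fails if gcd(a, m) != 1."""
    g, u, _ = extended_gcd(a % m, m)
    if g != 1:
        raise ValueError(f"{a} has no inverse modulo {m}")
    return u % m


print(inverse_mod(3, 7))
```

For a prime $m$ the call succeeds for every nonzero $a$; for a composite $m$ it raises an error on any $a$ sharing a factor with $m$, in line with Theorem 3.1.9.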

For a ring $R$, an integer $n \ge 1$ and $a \in R$, we denote by $na$ or $n \cdot a$ the
element

$$\sum_{i=1}^{n} a = \underbrace{a + a + \cdots + a}_{n}.$$


Definition 3.1.10 Let $F$ be a field. The characteristic of $F$ is the least positive
integer $p$ such that $p \cdot 1 = 0$, where 1 is the multiplicative identity of $F$. If no
such $p$ exists, we define the characteristic to be 0.

Example 3.1.11 (i) The characteristics of $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ are 0.

(ii) The characteristic of the field $\mathbb{Z}_p$ is $p$ for any prime $p$.

It follows from the following result that the characteristic of a field cannot
be composite.

Theorem 3.1.12 The characteristic of a field is either 0 or a prime number.

Proof. It is clear that 1 is not the characteristic as $1 \cdot 1 = 1 \ne 0$.

Suppose that the characteristic $p$ of a field $F$ is composite. Let $p = nm$ for
some positive integers $1 < n, m < p$. Put $a = n \cdot 1$ and $b = m \cdot 1$, where 1 is
the multiplicative identity of $F$. Then,

$$a \cdot b = (n \cdot 1)(m \cdot 1) = \left( \sum_{i=1}^{n} 1 \right) \left( \sum_{j=1}^{m} 1 \right) = (nm) \cdot 1 = p \cdot 1 = 0.$$

By Lemma 3.1.3(ii), $a = 0$ or $b = 0$; i.e., $m \cdot 1 = 0$ or $n \cdot 1 = 0$. This
contradicts the definition of the characteristic. □

Let $E$, $F$ be two fields and let $F$ be a subset of $E$. The field $F$ is called a
subfield of $E$ if the addition and multiplication of $E$, when restricted to $F$, are
the same as those of $F$.

Example 3.1.13 (i) The rational number field $\mathbb{Q}$ is a subfield of both the real
field $\mathbb{R}$ and the complex field $\mathbb{C}$, and $\mathbb{R}$ is a subfield of $\mathbb{C}$.

(ii) Let $F$ be a field of characteristic $p$; then, $\mathbb{Z}_p$ can be naturally viewed as
a subfield of $F$.

Theorem 3.1.14 A finite field $F$ of characteristic $p$ contains $p^n$ elements for
some integer $n \ge 1$.

Proof. Choose an element $\alpha_1$ from $F^*$. We claim that $0 \cdot \alpha_1, 1 \cdot \alpha_1, \ldots,$
$(p - 1) \cdot \alpha_1$ are pairwise distinct. Indeed, if $i \cdot \alpha_1 = j \cdot \alpha_1$ for some
$0 \le i \le j \le p - 1$, then $(j - i) \cdot \alpha_1 = 0$ and $0 \le j - i \le p - 1$. As the characteristic
of $F$ is $p$, this forces $j - i = 0$; i.e., $i = j$.

If $F = \{0 \cdot \alpha_1, 1 \cdot \alpha_1, \ldots, (p - 1) \cdot \alpha_1\}$, we are done. Otherwise, we choose
an element $\alpha_2$ in $F \setminus \{0 \cdot \alpha_1, 1 \cdot \alpha_1, \ldots, (p - 1) \cdot \alpha_1\}$. We claim that
$a_1\alpha_1 + a_2\alpha_2$ are pairwise distinct for all $0 \le a_1, a_2 \le p - 1$. Indeed, if

$$a_1\alpha_1 + a_2\alpha_2 = b_1\alpha_1 + b_2\alpha_2 \qquad (3.2)$$

for some $0 \le a_1, a_2, b_1, b_2 \le p - 1$, then we must have $a_2 = b_2$. Otherwise, we
would have from (3.2) that $\alpha_2 = (b_2 - a_2)^{-1}(a_1 - b_1)\alpha_1$. This is a contradiction
to our choice of $\alpha_2$. Since $a_2 = b_2$, it follows immediately from (3.2) that
$(a_1, a_2) = (b_1, b_2)$. As $F$ has only finitely many elements, we can continue in
this fashion and obtain elements $\alpha_1, \ldots, \alpha_n$ such that

$$\alpha_i \in F \setminus \{a_1\alpha_1 + \cdots + a_{i-1}\alpha_{i-1} : a_1, \ldots, a_{i-1} \in \mathbb{Z}_p\} \quad \text{for all } 2 \le i \le n,$$

and

$$F = \{a_1\alpha_1 + \cdots + a_n\alpha_n : a_1, \ldots, a_n \in \mathbb{Z}_p\}.$$

In the same manner, we can show that $a_1\alpha_1 + \cdots + a_n\alpha_n$ are pairwise distinct
for all $a_i \in \mathbb{Z}_p$, $i = 1, \ldots, n$. This implies that $|F| = p^n$. □


3.2 Polynomial rings 

Definition 3.2.1 Let $F$ be a field. The set

$$F[x] := \left\{ \sum_{i=0}^{n} a_i x^i : a_i \in F,\ n \ge 0 \right\}$$

is called the polynomial ring over $F$. ($F[x]$ is indeed a ring, namely axioms
(i)-(vii) of Definition 3.1.1 are satisfied.) An element of $F[x]$ is called a
polynomial over $F$. For a polynomial $f(x) = \sum_{i=0}^{n} a_i x^i$, the integer $n$ is called
the degree of $f(x)$, denoted by $\deg(f(x))$, if $a_n \ne 0$ (for convenience, we
define $\deg(0) = -\infty$). Furthermore, a nonzero polynomial $f(x) = \sum_{i=0}^{n} a_i x^i$
of degree $n$ is said to be monic if $a_n = 1$. A polynomial $f(x)$ of positive
degree is said to be reducible (over $F$) if there exist two polynomials $g(x)$
and $h(x)$ over $F$ such that $\deg(g(x)) < \deg(f(x))$, $\deg(h(x)) < \deg(f(x))$ and
$f(x) = g(x)h(x)$. Otherwise, the polynomial $f(x)$ of positive degree is said to
be irreducible (over $F$).

Example 3.2.2 (i) The polynomial $f(x) = x^4 + 2x^6 \in \mathbb{Z}_3[x]$ is of degree 6.
It is reducible as $f(x) = x^4(1 + 2x^2)$.

(ii) The polynomial $g(x) = 1 + x + x^2 \in \mathbb{Z}_2[x]$ is of degree 2. It is
irreducible. Otherwise, it would have a linear factor $x$ or $x + 1$; i.e., 0 or 1
would be a root of $g(x)$, but $g(0) = g(1) = 1 \in \mathbb{Z}_2$.

(iii) Using the same arguments as in (ii), we can show that both $1 + x + x^3$
and $1 + x^2 + x^3$ are irreducible over $\mathbb{Z}_2$ as they have no linear factors.

Definition 3.2.3 Let $f(x) \in F[x]$ be a polynomial of degree $n \ge 1$. Then, for
any polynomial $g(x) \in F[x]$, there exists a unique pair $(s(x), r(x))$ of polynomials
with $\deg(r(x)) < \deg(f(x))$ or $r(x) = 0$ such that $g(x) = s(x)f(x) + r(x)$.
The polynomial $r(x)$ is called the (principal) remainder of $g(x)$ divided by
$f(x)$, denoted by $(g(x) \pmod{f(x)})$.

For example, let $f(x) = 1 + x^2$ and $g(x) = x + 2x^4$ be two polynomials in
$\mathbb{Z}_5[x]$. Since we have $g(x) = x + 2x^4 = (3 + 2x^2)(1 + x^2) + (2 + x) =$
$(3 + 2x^2)f(x) + (2 + x)$, the remainder of $g(x)$ divided by $f(x)$ is $2 + x$.
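The division in this example can be checked mechanically. Below is a small Python sketch of long division in $\mathbb{Z}_p[x]$ for a prime $p$; it is an illustration under stated assumptions, not the book's algorithm. Polynomials are stored as coefficient lists with the constant term first, `poly_divmod` is a hypothetical helper name, and the three-argument `pow` with exponent $-1$ requires Python 3.8 or later.

```python
def poly_divmod(g, f, p):
    """Divide g by f in Z_p[x] (p prime); return (s, r) with g = s*f + r.

    Coefficient lists start with the constant term. The zero polynomial is
    returned as [] for the remainder and [0] for the quotient.
    """
    g = [c % p for c in g]
    f = [c % p for c in f]
    while f and f[-1] == 0:          # drop leading zeros of the divisor
        f.pop()
    if not f:
        raise ZeroDivisionError("division by the zero polynomial")
    lead_inv = pow(f[-1], -1, p)     # inverse of the leading coefficient
    r = g[:]
    s = [0] * max(len(g) - len(f) + 1, 1)
    for k in range(len(r) - len(f), -1, -1):
        c = (r[k + len(f) - 1] * lead_inv) % p
        s[k] = c
        for i, fi in enumerate(f):   # subtract c * x^k * f(x)
            r[k + i] = (r[k + i] - c * fi) % p
    while r and r[-1] == 0:
        r.pop()
    while len(s) > 1 and s[-1] == 0:
        s.pop()
    return s, r


# g(x) = x + 2x^4 and f(x) = 1 + x^2 over Z_5
print(poly_divmod([0, 1, 0, 0, 2], [1, 0, 1], 5))
```

The returned pair corresponds to the quotient $3 + 2x^2$ and the remainder $2 + x$ found above.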
Analogous to the integral ring Z, we can introduce the following notions. 

Definition 3.2.4 Let $f(x), g(x) \in F[x]$ be two nonzero polynomials. The
greatest common divisor of $f(x)$, $g(x)$, denoted by $\gcd(f(x), g(x))$, is the
monic polynomial of the highest degree which is a divisor of both $f(x)$
and $g(x)$. In particular, we say that $f(x)$ is co-prime (or prime) to $g(x)$
if $\gcd(f(x), g(x)) = 1$. The least common multiple of $f(x)$, $g(x)$, denoted
by $\operatorname{lcm}(f(x), g(x))$, is the monic polynomial of the lowest degree which is a
multiple of both $f(x)$ and $g(x)$.

Remark 3.2.5 (i) If $f(x)$ and $g(x)$ have the following factorizations:

$$f(x) = a \cdot p_1(x)^{e_1} \cdots p_n(x)^{e_n}, \qquad g(x) = b \cdot p_1(x)^{d_1} \cdots p_n(x)^{d_n},$$

where $a, b \in F^*$, $e_i, d_i \ge 0$ and $p_i(x)$ are distinct monic irreducible
polynomials (the existence and uniqueness of such a polynomial factorization are
well-known facts, cf. Theorem 1.59 of ref. [11]), then

$$\gcd(f(x), g(x)) = p_1(x)^{\min\{e_1, d_1\}} \cdots p_n(x)^{\min\{e_n, d_n\}}$$

and

$$\operatorname{lcm}(f(x), g(x)) = p_1(x)^{\max\{e_1, d_1\}} \cdots p_n(x)^{\max\{e_n, d_n\}}.$$

(ii) Let $f(x), g(x) \in F[x]$ be two nonzero polynomials. Then there exist two
polynomials $u(x), v(x)$ with $\deg(u(x)) < \deg(g(x))$ and $\deg(v(x)) < \deg(f(x))$
such that

$$\gcd(f(x), g(x)) = u(x)f(x) + v(x)g(x).$$

(iii) From (ii), it is easily shown that $\gcd(f(x)h(x), g(x)) = \gcd(f(x), g(x))$
if $\gcd(h(x), g(x)) = 1$.

There are many analogies between the integral ring $\mathbb{Z}$ and a polynomial ring
$F[x]$. We list some of them in Table 3.1.

Table 3.1. Analogies between $\mathbb{Z}$ and $F[x]$.

    The integral ring $\mathbb{Z}$   |  The polynomial ring $F[x]$
    ---------------------------------+-----------------------------------
    An integer $m$                   |  A polynomial $f(x)$
    A prime number $p$               |  An irreducible polynomial $p(x)$

Apart from the analogies in Table 3.1, we have the division algorithm, greatest
common divisors, least common multiples, etc., in both rings. Since, for
each integer $m > 1$ of $\mathbb{Z}$, the ring $\mathbb{Z}_m = \mathbb{Z}/(m)$ is constructed, we can guess that
the ring, denoted by $F[x]/(f(x))$, can be constructed for a given polynomial
$f(x)$ of degree $n \ge 1$. We make up Table 3.2 to define the ring $F[x]/(f(x))$
and compare it with $\mathbb{Z}/(m)$.

Table 3.2. More analogies between $\mathbb{Z}$ and $F[x]$.

    $\mathbb{Z}/(m)$                                      |  $F[x]/(f(x))$
    ------------------------------------------------------+------------------------------------------------------------
    $\mathbb{Z}_m = \{0, 1, \ldots, m - 1\}$              |  $F[x]/(f(x)) := \{\sum_{i=0}^{n-1} a_i x^i : a_i \in F,\ n \ge 1\}$
    $a \oplus b := (a + b \pmod{m})$                      |  $g(x) \oplus h(x) := (g(x) + h(x) \pmod{f(x)})$
    $a \odot b := (ab \pmod{m})$                          |  $g(x) \odot h(x) := (g(x)h(x) \pmod{f(x)})$
    $\mathbb{Z}_m$ is a ring                              |  $F[x]/(f(x))$ is a ring
    $\mathbb{Z}_m$ is a field $\iff$ $m$ is a prime       |  $F[x]/(f(x))$ is a field $\iff$ $f(x)$ is irreducible

We list the last two statements in the second column of Table 3.2 as a theorem.

Theorem 3.2.6 Let $f(x)$ be a polynomial over a field $F$ of degree $\ge 1$. Then
$F[x]/(f(x))$, together with the addition and multiplication defined in Table 3.2,
forms a ring. Furthermore, $F[x]/(f(x))$ is a field if and only if $f(x)$ is
irreducible.

Proof. It is easy to verify that $F[x]/(f(x))$ is a ring. By applying exactly the
same arguments as in the proof of Theorem 3.1.9, we can prove the second
part. □


Remark 3.2.7 (i) We will still denote '$\oplus$' and '$\odot$' in $F[x]/(f(x))$ by '$+$' and
'$\cdot$', respectively.

(ii) If $f(x)$ is a linear polynomial, then the field $F[x]/(f(x))$ is the field $F$
itself.


Example 3.2.8 (i) Consider the ring $\mathbb{R}[x]/(1 + x^2) = \{a + bx : a, b \in \mathbb{R}\}$. It
is a field since $1 + x^2$ is irreducible over $\mathbb{R}$. In fact, it is the complex field $\mathbb{C}$!
To see this, we just replace $x$ in $\mathbb{R}[x]/(1 + x^2)$ by the imaginary unit $i$.

(ii) Consider the ring

$$\mathbb{Z}_2[x]/(1 + x^2) = \{0, 1, x, 1 + x\}.$$

We construct the addition and multiplication tables as shown in Fig. 3.3. We
see from the multiplication table in Fig. 3.3 that $\mathbb{Z}_2[x]/(1 + x^2)$ is not a field
as $(1 + x)(1 + x) = 0$.

Fig. 3.3. Addition and multiplication tables for $\mathbb{Z}_2[x]/(1 + x^2)$.

(iii) Consider the ring

$$\mathbb{Z}_2[x]/(1 + x + x^2) = \{0, 1, x, 1 + x\}.$$

As $1 + x + x^2$ is irreducible over $\mathbb{Z}_2$, the ring $\mathbb{Z}_2[x]/(1 + x + x^2)$ is in fact
a field. This can also be verified by the addition and multiplication tables in
Fig. 3.4.

      +   |   0     1     x    1+x
    ------+-------------------------
      0   |   0     1     x    1+x
      1   |   1     0    1+x    x
      x   |   x    1+x    0     1
     1+x  |  1+x    x     1     0

      ·   |   0     1     x    1+x
    ------+-------------------------
      0   |   0     0     0     0
      1   |   0     1     x    1+x
      x   |   0     x    1+x    1
     1+x  |   0    1+x    1     x

Fig. 3.4. Addition and multiplication tables for $\mathbb{Z}_2[x]/(1 + x + x^2)$.
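Multiplication in $\mathbb{Z}_p[x]/(f(x))$ is mechanical enough to script. The following Python sketch is illustrative only: it assumes $f(x)$ is monic, stores residues as coefficient tuples of length $\deg f$ (constant term first), and uses the hypothetical names `mul_mod` and `is_field`. It reproduces the contrast between parts (ii) and (iii) of Example 3.2.8 by brute force.

```python
from itertools import product


def mul_mod(a, b, f, p):
    """Multiply a and b in Z_p[x]/(f), with f monic of degree n = len(f) - 1."""
    n = len(f) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # Eliminate x^k for k >= n using x^n = -(f_0 + f_1 x + ... + f_{n-1} x^{n-1}).
    for k in range(2 * n - 2, n - 1, -1):
        c = prod[k]
        if c:
            prod[k] = 0
            for i in range(n):
                prod[k - n + i] = (prod[k - n + i] - c * f[i]) % p
    return tuple(prod[:n])


def is_field(f, p):
    """Check by exhaustive search that every nonzero residue has an inverse."""
    n = len(f) - 1
    elems = list(product(range(p), repeat=n))
    zero = (0,) * n
    one = (1,) + (0,) * (n - 1)
    return all(
        any(mul_mod(a, b, f, p) == one for b in elems)
        for a in elems if a != zero
    )


print(mul_mod((1, 1), (1, 1), (1, 0, 1), 2))   # (1 + x)^2 modulo 1 + x^2
print(mul_mod((1, 1), (1, 1), (1, 1, 1), 2))   # (1 + x)^2 modulo 1 + x + x^2
print(is_field((1, 0, 1), 2), is_field((1, 1, 1), 2))
```

The exhaustive search is only sensible for very small fields; it is meant as a check on the tables, not as a practical irreducibility test.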

3.3 Structure of finite fields 

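Lemma 3.3.1 For every element $\beta$ of a finite field $F$ with $q$ elements, we have
$\beta^q = \beta$.

Proof. It is trivial for the case where $\beta$ is zero. Now assume that $\beta \ne 0$. We label
all the nonzero elements of $F$: $F^* = \{\beta_1, \ldots, \beta_{q-1}\}$. Thus, $F^* = \{\beta\beta_1, \ldots,$
$\beta\beta_{q-1}\}$. We obtain $\beta_1 \cdots \beta_{q-1} = (\beta\beta_1) \cdots (\beta\beta_{q-1})$; i.e., $\beta_1 \cdots \beta_{q-1} =$
$\beta^{q-1}(\beta_1 \cdots \beta_{q-1})$. Hence, $\beta^{q-1} = 1$. The desired result follows. □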
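For the prime fields $\mathbb{Z}_p$ the lemma is easy to test numerically. The short Python sketch below is an illustration only (the function name `satisfies_lemma` is hypothetical): it checks $\beta^q = \beta$ for every residue, and shows that the identity fails in $\mathbb{Z}_4$, which is a ring but not a field.

```python
def satisfies_lemma(m):
    """True if b^m == b (mod m) for every residue b in Z_m."""
    return all(pow(b, m, m) == b for b in range(m))


for p in (2, 3, 5, 7, 11, 13):
    print(p, satisfies_lemma(p))
print(4, satisfies_lemma(4))   # 2^4 = 16 is 0 modulo 4, not 2
```

Passing this check does not by itself show that $\mathbb{Z}_m$ is a field; the sketch only illustrates the direction stated in the lemma.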

Corollary 3.3.2 Let $F$ be a subfield of $E$ with $|F| = q$. Then an element $\beta$ of
$E$ lies in $F$ if and only if $\beta^q = \beta$.

Proof. '$\Rightarrow$' This is clear from Lemma 3.3.1.

'$\Leftarrow$' Consider the polynomial $x^q - x$. It has at most $q$ distinct roots in $E$
(see Theorem 1.66 of ref. [11]). As all the elements of $F$ are roots of $x^q - x$,
and $|F| = q$, we obtain $F = \{\text{all roots of } x^q - x \text{ in } E\}$. Hence, for any $\beta \in E$
satisfying $\beta^q = \beta$, it is a root of $x^q - x$; i.e., $\beta$ lies in $F$. □

For a field $F$ of characteristic $p > 0$, we can easily show that $(\alpha + \beta)^{p^m} =$
$\alpha^{p^m} + \beta^{p^m}$ for any $\alpha, \beta \in F$ and $m \ge 0$ (see Exercise 3.4(iii)).

For two fields $E$ and $F$, the composite field $E \cdot F$ is the smallest field
containing both $E$ and $F$.

Using these results, we are ready to prove the main characterization of finite 
fields. 

Theorem 3.3.3 For any prime $p$ and integer $n \ge 1$, there exists a unique finite
field of $p^n$ elements.

Proof. (Existence) Let $f(x)$ be an irreducible polynomial of degree $n$ over $\mathbb{Z}_p$
(note that the existence of such a polynomial is guaranteed by Exercise 3.28(ii) by
showing that $I_p(n) > 0$ for all primes $p$ and integers $n > 0$). It follows from Theorem
3.2.6 that the residue ring $\mathbb{Z}_p[x]/(f(x))$ is in fact a field. It is easy to verify
that this field has exactly $p^n$ elements.

(Uniqueness) Let $E$ and $F$ be two fields of $p^n$ elements. In the composite
field $E \cdot F$, consider the polynomial $x^{p^n} - x$ over $E \cdot F$. By Corollary 3.3.2,
$E = \{\text{all roots of } x^{p^n} - x\} = F$. □

From now on, it makes sense to denote the finite field with $q$ elements by
$\mathbb{F}_q$ or $GF(q)$.

For an irreducible polynomial $f(x)$ of degree $n$ over a field $F$, let $\alpha$ be a root
of $f(x)$. Then the field $F[x]/(f(x))$ can be represented as

$$F[\alpha] = \{a_0 + a_1\alpha + \cdots + a_{n-1}\alpha^{n-1} : a_i \in F\} \qquad (3.3)$$

if we replace $x$ in $F[x]/(f(x))$ by $\alpha$, as we did in Example 3.2.8(i). An
advantage of using $F[\alpha]$ to replace the field $F[x]/(f(x))$ is that we can avoid
the confusion between an element of $F[x]/(f(x))$ and a polynomial over $F$.

Definition 3.3.4 An element $\alpha$ in a finite field $\mathbb{F}_q$ is called a primitive element
(or generator) of $\mathbb{F}_q$ if $\mathbb{F}_q = \{0, \alpha, \alpha^2, \ldots, \alpha^{q-1}\}$.

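Example 3.3.5 Consider the field $\mathbb{F}_4 = \mathbb{F}_2[\alpha]$, where $\alpha$ is a root of the
irreducible polynomial $1 + x + x^2 \in \mathbb{F}_2[x]$. Then we have

$$\alpha^2 = -(1 + \alpha) = 1 + \alpha, \qquad \alpha^3 = \alpha(\alpha^2) = \alpha(1 + \alpha) = \alpha + \alpha^2 = \alpha + 1 + \alpha = 1.$$

Thus, $\mathbb{F}_4 = \{0, \alpha, 1 + \alpha, 1\} = \{0, \alpha, \alpha^2, \alpha^3\}$, so $\alpha$ is a primitive element.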

Definition 3.3.6 The order of a nonzero element $\alpha \in \mathbb{F}_q$, denoted by $\operatorname{ord}(\alpha)$,
is the smallest positive integer $k$ such that $\alpha^k = 1$.

Example 3.3.7 Since there are no linear factors for the polynomial $1 + x^2$
over $\mathbb{F}_3$, $1 + x^2$ is irreducible over $\mathbb{F}_3$. Consider the element $\alpha$ in the field
$\mathbb{F}_9 = \mathbb{F}_3[\alpha]$, where $\alpha$ is a root of $1 + x^2$. Then $\alpha^2 = -1$, $\alpha^3 = \alpha(\alpha^2) = -\alpha$
and

$$\alpha^4 = (\alpha^2)^2 = (-1)^2 = 1.$$

This means that $\operatorname{ord}(\alpha) = 4$.
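Orders in a field as small as $\mathbb{F}_9$ can be found by repeated multiplication. The Python sketch below is an illustration, assuming the representation of this example: an element $a + b\alpha$ with $\alpha^2 = -1$ is stored as the pair `(a, b)` of residues modulo 3. The names `mul` and `order` are hypothetical.

```python
P = 3  # the prime field F_3


def mul(u, v):
    """Multiply a + b*alpha and c + d*alpha in F_9 = F_3[alpha], alpha^2 = -1."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def order(u):
    """Smallest k >= 1 with u^k = 1, for a nonzero element u of F_9."""
    if u == (0, 0):
        raise ValueError("the zero element has no multiplicative order")
    k, power = 1, u
    while power != (1, 0):
        power = mul(power, u)
        k += 1
    return k


print(order((0, 1)))   # alpha
print(order((1, 1)))   # 1 + alpha
```

In this representation $\alpha$ itself is not a primitive element, whereas $1 + \alpha$ is: its order equals $q - 1 = 8$.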

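Lemma 3.3.8 (i) The order $\operatorname{ord}(\alpha)$ divides $q - 1$ for every $\alpha \in \mathbb{F}_q^*$.

(ii) For two nonzero elements $\alpha, \beta \in \mathbb{F}_q^*$, if $\gcd(\operatorname{ord}(\alpha), \operatorname{ord}(\beta)) = 1$, then
$\operatorname{ord}(\alpha\beta) = \operatorname{ord}(\alpha) \times \operatorname{ord}(\beta)$.

Proof. (i) Let $m$ be a positive integer satisfying $\alpha^m = 1$. Write $m = a \cdot \operatorname{ord}(\alpha) + b$
for some integers $a \ge 0$ and $0 \le b < \operatorname{ord}(\alpha)$. Then

$$1 = \alpha^m = \alpha^{a \cdot \operatorname{ord}(\alpha) + b} = \left(\alpha^{\operatorname{ord}(\alpha)}\right)^a \cdot \alpha^b = \alpha^b.$$

This forces $b = 0$; i.e., $\operatorname{ord}(\alpha)$ is a divisor of $m$. Since $\alpha^{q-1} = 1$, we obtain
$\operatorname{ord}(\alpha) \mid (q - 1)$.

(ii) Put $r = \operatorname{ord}(\alpha) \times \operatorname{ord}(\beta)$. It is clear that $\alpha^r = 1 = \beta^r$ as both $\operatorname{ord}(\alpha)$
and $\operatorname{ord}(\beta)$ are divisors of $r$. Thus, $(\alpha\beta)^r = \alpha^r\beta^r = 1$. Therefore, $\operatorname{ord}(\alpha\beta) \le$
$\operatorname{ord}(\alpha) \times \operatorname{ord}(\beta)$. On the other hand, put $t = \operatorname{ord}(\alpha\beta)$. We have

$$1 = (\alpha\beta)^{t \cdot \operatorname{ord}(\alpha)} = \left(\alpha^{\operatorname{ord}(\alpha)}\right)^t \cdot \beta^{t \cdot \operatorname{ord}(\alpha)} = \beta^{t \cdot \operatorname{ord}(\alpha)}.$$

This implies that $\operatorname{ord}(\beta)$ divides $t \cdot \operatorname{ord}(\alpha)$ by the proof of part (i), so $\operatorname{ord}(\beta)$
divides $t$ as $\operatorname{ord}(\alpha)$ is prime to $\operatorname{ord}(\beta)$. In the same way, we can show that $\operatorname{ord}(\alpha)$
divides $t$. This implies that $\operatorname{ord}(\alpha) \times \operatorname{ord}(\beta)$ divides $t$. Thus, $\operatorname{ord}(\alpha\beta) = t \ge$
$\operatorname{ord}(\alpha) \times \operatorname{ord}(\beta)$. The desired result follows. □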

We now show the existence of primitive elements and give a characterization 
of primitive elements in terms of their order. 

Proposition 3.3.9 (i) A nonzero element of $\mathbb{F}_q$ is a primitive element if and only
if its order is $q - 1$.

(ii) Every finite field has at least one primitive element.

Proof. (i) It is easy to see that $\alpha \in \mathbb{F}_q^*$ has order $q - 1$ if and only if the
elements $\alpha, \alpha^2, \ldots, \alpha^{q-1}$ are distinct. This is equivalent to saying that $\mathbb{F}_q =$
$\{0, \alpha, \alpha^2, \ldots, \alpha^{q-1}\}$.

(ii) Let $m$ be the least common multiple of the orders of all the elements of
$\mathbb{F}_q^*$. If $r^k$ is a prime power in the canonical factorization of $m$, then $r^k \mid \operatorname{ord}(\alpha)$
for some $\alpha \in \mathbb{F}_q^*$. The order of $\alpha^{\operatorname{ord}(\alpha)/r^k}$ is $r^k$. Thus, if

$$m = r_1^{k_1} \cdots r_n^{k_n}$$

is the canonical factorization of $m$ for distinct primes $r_1, \ldots, r_n$, then for each
$i = 1, \ldots, n$ there exists $\beta_i \in \mathbb{F}_q^*$ with $\operatorname{ord}(\beta_i) = r_i^{k_i}$. Lemma 3.3.8(ii) implies
that there exists $\beta \in \mathbb{F}_q^*$ with $\operatorname{ord}(\beta) = m$. Now $m \mid (q - 1)$ by Lemma 3.3.8(i),
and, on the other hand, all the $q - 1$ elements of $\mathbb{F}_q^*$ are roots of the polynomial
$x^m - 1$, so that $m \ge q - 1$. Hence, $\operatorname{ord}(\beta) = m = q - 1$, and the result follows
from part (i). □

Remark 3.3.10 (i) Primitive elements are not unique. This can be seen from
Exercises 3.12-3.15.

(ii) If $\alpha$ is a root of an irreducible polynomial of degree $m$ over $\mathbb{F}_q$, and
it is also a primitive element of $\mathbb{F}_{q^m} = \mathbb{F}_q[\alpha]$, then every element in $\mathbb{F}_{q^m}$ can
be represented both as a polynomial in $\alpha$ and as a power of $\alpha$, since $\mathbb{F}_{q^m} =$
$\{a_0 + a_1\alpha + \cdots + a_{m-1}\alpha^{m-1} : a_i \in \mathbb{F}_q\} = \{0, \alpha, \alpha^2, \ldots, \alpha^{q^m-1}\}$. Addition
for the elements of $\mathbb{F}_{q^m}$ is easily carried out if the elements are represented
as polynomials in $\alpha$, whilst multiplication is easily done if the elements are
represented as powers of $\alpha$.


Example 3.3.11 Let $\alpha$ be a root of $1 + x + x^3 \in \mathbb{F}_2[x]$ (by Example 3.2.2(iii),
this polynomial is irreducible over $\mathbb{F}_2$). Hence, $\mathbb{F}_8 = \mathbb{F}_2[\alpha]$. The order of $\alpha$ is
a divisor of $8 - 1 = 7$. Thus, $\operatorname{ord}(\alpha) = 7$ and $\alpha$ is a primitive element. In fact,
any nonzero element in $\mathbb{F}_8$ except 1 is a primitive element. We list a table (see
Table 3.3) for the elements of $\mathbb{F}_8$ expressed in two forms.

Table 3.3. Elements of $\mathbb{F}_8$.

    $0 = 0$                       $1 + \alpha = \alpha^3$
    $1 = \alpha^7 = \alpha^0$     $\alpha + \alpha^2 = \alpha^4$
    $\alpha = \alpha^1$           $1 + \alpha + \alpha^2 = \alpha^5$
    $\alpha^2 = \alpha^2$         $1 + \alpha^2 = \alpha^6$

Based on Table 3.3, both the addition and multiplication in $\mathbb{F}_8$ can be easily
implemented. We use powers of $\alpha$ to represent the elements in $\mathbb{F}_8$. For instance,

$$\alpha^3 + \alpha^6 = (1 + \alpha) + (1 + \alpha^2) = \alpha + \alpha^2 = \alpha^4, \qquad \alpha^3 \cdot \alpha^6 = \alpha^9 = \alpha^2.$$
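The two representations of $\mathbb{F}_8$ can be tabulated programmatically. The following Python sketch is an illustration under one particular encoding, chosen here and not taken from the text: the element $c_0 + c_1\alpha + c_2\alpha^2$ is stored as the bit mask `c0 | c1 << 1 | c2 << 2`, so that addition is bitwise XOR. The names `build_tables`, `add_powers` and `mul_powers` are hypothetical.

```python
def build_tables():
    """Return (exp, log) for F_8 = F_2[alpha] with 1 + alpha + alpha^3 = 0.

    exp[i] is the bit mask of alpha^i for i = 0..6; log inverts that map.
    """
    exp = []
    x = 1
    for _ in range(7):
        exp.append(x)
        x <<= 1                 # multiply by alpha
        if x & 0b1000:          # a term alpha^3 appeared
            x ^= 0b1011         # reduce with alpha^3 = 1 + alpha
    log = {v: i for i, v in enumerate(exp)}
    return exp, log


def add_powers(i, j, exp, log):
    """Exponent k with alpha^i + alpha^j = alpha^k, or None if the sum is 0."""
    return log.get(exp[i % 7] ^ exp[j % 7])


def mul_powers(i, j):
    """Exponent of alpha^i * alpha^j, reduced modulo 7."""
    return (i + j) % 7


exp, log = build_tables()
print(exp)
print(add_powers(3, 6, exp, log), mul_powers(3, 6))
```

The last line recomputes the two sample operations of the example, $\alpha^3 + \alpha^6$ and $\alpha^3 \cdot \alpha^6$, as exponents of $\alpha$.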

From the above example, we know that both the addition and multiplication can 
be carried out easily if we have a table representing the elements of finite fields 
both in polynomial form and as powers of primitive elements. In fact, we can 
simplify this table by using another table, called Zech’s log table, constructed 
as follows. 

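Let $\alpha$ be a primitive element of $\mathbb{F}_q$. For each $0 \le i \le q - 2$ or $i = \infty$, we
determine and tabulate $z(i)$ such that $1 + \alpha^i = \alpha^{z(i)}$ (note that we set $\alpha^\infty = 0$).
Then for any two elements $\alpha^i$ and $\alpha^j$ with $0 \le i \le j \le q - 2$ in $\mathbb{F}_q$, we obtain

$$\alpha^i + \alpha^j = \alpha^i(1 + \alpha^{j-i}) = \alpha^{i + z(j-i) \pmod{q-1}}, \qquad \alpha^i \cdot \alpha^j = \alpha^{i+j \pmod{q-1}}.$$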


Example 3.3.12 Let $\alpha$ be a root of $1 + 2x + x^3 \in \mathbb{F}_3[x]$. This polynomial is
irreducible over $\mathbb{F}_3$ as it has no linear factors. Hence, $\mathbb{F}_{27} = \mathbb{F}_3[\alpha]$. The order
of $\alpha$ is a divisor of $27 - 1 = 26$. Thus, $\operatorname{ord}(\alpha)$ is 2, 13 or 26. First, $\operatorname{ord}(\alpha) \ne 2$;
otherwise, $\alpha$ would be 1 or $-1$, neither of which is a root of $1 + 2x + x^3$.
Furthermore, we have $\alpha^{13} = -1 \ne 1$. Thus, $\operatorname{ord}(\alpha) = 26$ and $\alpha$ is a primitive
element of $\mathbb{F}_{27}$. After some computation, we obtain a Zech's log table for $\mathbb{F}_{27}$
with respect to $\alpha$ (Table 3.4). Now we can carry out operations in $\mathbb{F}_{27}$ easily.
For instance, we have

$$\alpha^7 + \alpha^{11} = \alpha^7(1 + \alpha^4) = \alpha^7 \cdot \alpha^{18} = \alpha^{25}, \qquad \alpha^7 \cdot \alpha^{11} = \alpha^{18}.$$

Table 3.4. Zech's log table for $\mathbb{F}_{27}$.

      i    z(i)  |   i    z(i)  |   i    z(i)
    -------------+--------------+-------------
      ∞     0    |   8     15   |  17     20
      0    13    |   9      3   |  18      7
      1     9    |  10      6   |  19     23
      2    21    |  11     10   |  20      5
      3     1    |  12      2   |  21     12
      4    18    |  13      ∞   |  22     14
      5    17    |  14     16   |  23     24
      6    11    |  15     25   |  24     19
      7     4    |  16     22   |  25      8

3.4 Minimal polynomials 

Let $\mathbb{F}_q$ be a subfield of $\mathbb{F}_r$. For an element $\alpha$ of $\mathbb{F}_r$, we are interested in nonzero
polynomials $f(x) \in \mathbb{F}_q[x]$ of the least degree such that $f(\alpha) = 0$.


Definition 3.4.1 A minimal polynomial of an element $\alpha \in \mathbb{F}_{q^m}$ with respect to
$\mathbb{F}_q$ is a nonzero monic polynomial $f(x)$ of the least degree in $\mathbb{F}_q[x]$ such that
$f(\alpha) = 0$.


Example 3.4.2 Let $\alpha$ be a root of the polynomial $1 + x + x^2 \in \mathbb{F}_2[x]$. It is
clear that the two linear polynomials $x$ and $1 + x$ are not minimal polynomials
of $\alpha$. Therefore, $1 + x + x^2$ is a minimal polynomial of $\alpha$.

Since $1 + (1 + \alpha) + (1 + \alpha)^2 = 1 + 1 + \alpha + 1 + \alpha^2 = 1 + \alpha + \alpha^2 = 0$ and
$1 + \alpha$ is not a root of $x$ or $1 + x$, $1 + x + x^2$ is also a minimal polynomial of
$1 + \alpha$.


Theorem 3.4.3 (i) The minimal polynomial of an element of $\mathbb{F}_{q^m}$ with respect
to $\mathbb{F}_q$ exists and is unique. It is also irreducible over $\mathbb{F}_q$.

(ii) If a monic irreducible polynomial $M(x) \in \mathbb{F}_q[x]$ has $\alpha \in \mathbb{F}_{q^m}$ as a root,
then it is the minimal polynomial of $\alpha$ with respect to $\mathbb{F}_q$.

Proof. (i) Let $\alpha$ be an element of $\mathbb{F}_{q^m}$. As $\alpha$ is a root of $x^{q^m} - x$, we know the
existence of a minimal polynomial of $\alpha$.

Suppose that $M_1(x), M_2(x) \in \mathbb{F}_q[x]$ are both minimal polynomials of $\alpha$.
By the division algorithm, we have $M_1(x) = s(x)M_2(x) + r(x)$ for some
polynomial $r(x)$ with $r(x) = 0$ or $\deg(r(x)) < \deg(M_2(x))$. Evaluating the
polynomials at $\alpha$, we obtain $0 = M_1(\alpha) = s(\alpha)M_2(\alpha) + r(\alpha) = r(\alpha)$. By
the definition of minimal polynomials, this forces $r(x) = 0$; i.e., $M_2(x) \mid M_1(x)$.
Similarly, we have $M_1(x) \mid M_2(x)$. Thus, we obtain $M_1(x) = M_2(x)$ since both
are monic.

Let $M(x)$ be the minimal polynomial of $\alpha$. Suppose that it is reducible.
Then we have two monic polynomials $f(x) \in \mathbb{F}_q[x]$ and $g(x) \in \mathbb{F}_q[x]$ such
that $\deg(f(x)) < \deg(M(x))$, $\deg(g(x)) < \deg(M(x))$ and $M(x) = f(x)g(x)$.
Thus, we have $0 = M(\alpha) = f(\alpha)g(\alpha)$, which implies that $f(\alpha) = 0$ or
$g(\alpha) = 0$. This contradicts the minimality of the degree of $M(x)$.

(ii) Let $f(x)$ be the minimal polynomial of $\alpha$ with respect to $\mathbb{F}_q$. By
the division algorithm, there exist polynomials $h(x), e(x) \in \mathbb{F}_q[x]$ such that
$M(x) = h(x)f(x) + e(x)$ and $\deg(e(x)) < \deg(f(x))$. Evaluating the polynomials
at $\alpha$, we obtain $0 = M(\alpha) = h(\alpha)f(\alpha) + e(\alpha) = e(\alpha)$. By the definition
of the minimal polynomial, this forces $e(x) = 0$. This implies that $f(x)$ is the
same as $M(x)$ since $M(x)$ is a monic irreducible polynomial and $f(x)$ cannot
be a nonzero constant. This completes the proof. □

Example 3.4.4 Let $f(x) \in \mathbb{F}_q[x]$ be a monic irreducible polynomial of degree
$m$. Let $\alpha$ be a root of $f(x)$. Then the minimal polynomial of $\alpha \in \mathbb{F}_{q^m}$ with
respect to $\mathbb{F}_q$ is $f(x)$ itself. For instance, the minimal polynomial of a root of
$2 + x + x^2 \in \mathbb{F}_3[x]$ is $2 + x + x^2$.

If we are given the minimal polynomial of a primitive element $\alpha \in \mathbb{F}_{q^m}$, we
would like to find the minimal polynomial of $\alpha^i$, for any $i$. In order to do so,
we have to start with cyclotomic cosets.

Definition 3.4.5 Let $n$ be co-prime to $q$. The cyclotomic coset of $q$ (or
$q$-cyclotomic coset) modulo $n$ containing $i$ is defined by

$$C_i = \{(i \cdot q^j \pmod{n}) \in \mathbb{Z}_n : j = 0, 1, \ldots\}.$$

A subset $\{i_1, \ldots, i_t\}$ of $\mathbb{Z}_n$ is called a complete set of representatives of
cyclotomic cosets of $q$ modulo $n$ if $C_{i_1}, \ldots, C_{i_t}$ are distinct and $\bigcup_{j=1}^{t} C_{i_j} = \mathbb{Z}_n$.

Remark 3.4.6 (i) It is easy to verify that two cyclotomic cosets are either equal
or disjoint. Hence, the cyclotomic cosets partition $\mathbb{Z}_n$.

(ii) If $n = q^m - 1$ for some $m \ge 1$, each cyclotomic coset contains at most
$m$ elements, as $q^m \equiv 1 \pmod{q^m - 1}$.

(iii) It is easy to see that, in the case of $n = q^m - 1$ for some $m \ge 1$, $|C_i| = m$
if $\gcd(i, q^m - 1) = 1$.

Example 3.4.7 (i) Consider the cyclotomic cosets of 2 modulo 15:

$$C_0 = \{0\}, \quad C_1 = \{1, 2, 4, 8\}, \quad C_3 = \{3, 6, 9, 12\},$$
$$C_5 = \{5, 10\}, \quad C_7 = \{7, 11, 13, 14\}.$$

Thus, $C_1 = C_2 = C_4 = C_8$, and so on. The set $\{0, 1, 3, 5, 7\}$ is a complete set
of representatives of cyclotomic cosets of 2 modulo 15. The set $\{0, 1, 6, 10, 7\}$
is also a complete set of representatives of cyclotomic cosets of 2 modulo 15.

(ii) Consider the cyclotomic cosets of 3 modulo 26:

$$C_0 = \{0\}, \quad C_1 = \{1, 3, 9\}, \quad C_2 = \{2, 6, 18\},$$
$$C_4 = \{4, 12, 10\}, \quad C_5 = \{5, 15, 19\}, \quad C_7 = \{7, 21, 11\},$$
$$C_8 = \{8, 24, 20\}, \quad C_{13} = \{13\}, \quad C_{14} = \{14, 16, 22\},$$
$$C_{17} = \{17, 25, 23\}.$$

In this case, we have $C_1 = C_3 = C_9$, and so on. The set $\{0, 1, 2, 4, 5, 7, 8,$
$13, 14, 17\}$ is a complete set of representatives of cyclotomic cosets of 3
modulo 26.
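Cyclotomic cosets are straightforward to enumerate by repeated multiplication by $q$. The Python sketch below is illustrative; the names `cyclotomic_coset` and `all_cosets` are hypothetical, and the code assumes $\gcd(q, n) = 1$ as in Definition 3.4.5.

```python
def cyclotomic_coset(i, q, n):
    """The cyclotomic coset of q modulo n containing i, in order of generation."""
    coset = []
    j = i % n
    while j not in coset:
        coset.append(j)
        j = (j * q) % n
    return coset


def all_cosets(q, n):
    """Map the smallest element of each coset to the coset itself."""
    seen = set()
    cosets = {}
    for i in range(n):
        if i not in seen:
            c = cyclotomic_coset(i, q, n)
            cosets[i] = c
            seen.update(c)
    return cosets


print(all_cosets(2, 15))
print(sorted(all_cosets(3, 26)))
```

The keys of the dictionary returned by `all_cosets` form a complete set of representatives; other choices of representatives, such as $\{0, 1, 6, 10, 7\}$ for $q = 2$, $n = 15$, are equally valid.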

We are now ready to determine the minimal polynomials for all the elements 
in a finite field. 

Theorem 3.4.8 Let $\alpha$ be a primitive element of $\mathbb{F}_{q^m}$. Then the minimal
polynomial of $\alpha^i$ with respect to $\mathbb{F}_q$ is

$$M^{(i)}(x) := \prod_{j \in C_i} (x - \alpha^j),$$

where $C_i$ is the unique cyclotomic coset of $q$ modulo $q^m - 1$ containing $i$.

Proof. Step 1: It is clear that $\alpha^i$ is a root of $M^{(i)}(x)$ as $i \in C_i$.

Step 2: Let $M^{(i)}(x) = a_0 + a_1 x + \cdots + a_r x^r$, where $a_k \in \mathbb{F}_{q^m}$ and $r = |C_i|$.
Raising each coefficient to its $q$th power, we obtain

$$a_0^q + a_1^q x + \cdots + a_r^q x^r = \prod_{j \in C_i} (x - \alpha^{qj}) = \prod_{j \in C_{qi}} (x - \alpha^j) = \prod_{j \in C_i} (x - \alpha^j) = M^{(i)}(x).$$

Note that we use the fact that $C_i = C_{qi}$ in the above formula. Hence, $a_k = a_k^q$
for all $0 \le k \le r$; i.e., $a_k$ are elements of $\mathbb{F}_q$. This means that $M^{(i)}(x)$ is a
polynomial over $\mathbb{F}_q$.

Step 3: Since $\alpha$ is a primitive element, we have $\alpha^j \ne \alpha^k$ for two distinct
elements $j, k$ of $C_i$. Hence, $M^{(i)}(x)$ has no multiple roots. Now let $f(x) \in$
$\mathbb{F}_q[x]$ and $f(\alpha^i) = 0$. Put

$$f(x) = f_0 + f_1 x + \cdots + f_n x^n$$

for some $f_k \in \mathbb{F}_q$. Then, for any $j \in C_i$, there exists an integer $l$ such that
$j \equiv iq^l \pmod{q^m - 1}$. Hence,

$$f(\alpha^j) = f(\alpha^{iq^l}) = f_0 + f_1\alpha^{iq^l} + \cdots + f_n\alpha^{niq^l}$$
$$= f_0^{q^l} + f_1^{q^l}\alpha^{iq^l} + \cdots + f_n^{q^l}\alpha^{niq^l}$$
$$= (f_0 + f_1\alpha^i + \cdots + f_n\alpha^{ni})^{q^l}$$
$$= f(\alpha^i)^{q^l} = 0.$$

This implies that $M^{(i)}(x)$ is a divisor of $f(x)$.

The above three steps show that $M^{(i)}(x)$ is the minimal polynomial
of $\alpha^i$.

Remark 3.4.9 (i) The degree of the minimal polynomial of $\alpha^i$ is equal to the
size of the cyclotomic coset containing $i$.

(ii) From Theorem 3.4.8, we know that $\alpha^i$ and $\alpha^k$ have the same minimal
polynomial if and only if $i, k$ are in the same cyclotomic coset.

Example 3.4.10 Let $\alpha$ be a root of $2 + x + x^2 \in \mathbb{F}_3[x]$; i.e.,

$$2 + \alpha + \alpha^2 = 0. \qquad (3.4)$$

Then the minimal polynomial of $\alpha$ as well as $\alpha^3$ is $2 + x + x^2$. The minimal
polynomial of $\alpha^2$ is

$$M^{(2)}(x) = \prod_{j \in C_2} (x - \alpha^j) = (x - \alpha^2)(x - \alpha^6) = \alpha^8 - (\alpha^2 + \alpha^6)x + x^2.$$

We know that $\alpha^8 = 1$ as $\alpha \in \mathbb{F}_9$. To find $M^{(2)}(x)$, we have to simplify $\alpha^2 + \alpha^6$.
We make use of the relationship (3.4) to obtain

$$\alpha^2 + \alpha^6 = (1 - \alpha) + (1 - \alpha)^3 = 2 - \alpha - \alpha^3$$
$$= 2 - \alpha - \alpha(1 - \alpha) = 2 - 2\alpha + \alpha^2 = 0.$$

Hence, the minimal polynomial of $\alpha^2$ is $1 + x^2$. In the same way, we may
obtain the minimal polynomial $2 + 2x + x^2$ of $\alpha^5$.
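The product formula of Theorem 3.4.8 can be evaluated directly in a small field. The Python sketch below does so for $\mathbb{F}_9 = \mathbb{F}_3[\alpha]$ with $2 + \alpha + \alpha^2 = 0$, i.e. $\alpha^2 = 1 + 2\alpha$. It is an illustration only: elements $a + b\alpha$ are stored as pairs `(a, b)`, polynomials as coefficient lists with the constant term first, and all function names are hypothetical.

```python
P = 3  # the prime field F_3


def mul(u, v):
    """Multiply a + b*alpha and c + d*alpha using alpha^2 = 1 + 2*alpha."""
    a, b = u
    c, d = v
    return ((a * c + b * d) % P, (a * d + b * c + 2 * b * d) % P)


def power(u, k):
    """u^k by repeated multiplication (k >= 0)."""
    r = (1, 0)
    for _ in range(k):
        r = mul(r, u)
    return r


def coset(i, q=3, n=8):
    """Cyclotomic coset of q modulo n containing i."""
    out, j = [], i % n
    while j not in out:
        out.append(j)
        j = (j * q) % n
    return out


def minimal_polynomial(i):
    """Coefficients in F_3 of the product of (x - alpha^j) over j in C_i."""
    alpha = (0, 1)
    poly = [(1, 0)]                       # the constant polynomial 1
    for j in coset(i):
        root = power(alpha, j)
        neg = ((-root[0]) % P, (-root[1]) % P)
        new = [(0, 0)] * (len(poly) + 1)
        for k, c in enumerate(poly):      # multiply poly by (x - root)
            t = mul(c, neg)
            new[k] = ((new[k][0] + t[0]) % P, (new[k][1] + t[1]) % P)
            new[k + 1] = ((new[k + 1][0] + c[0]) % P, (new[k + 1][1] + c[1]) % P)
        poly = new
    if any(c[1] != 0 for c in poly):
        raise ValueError("coefficients are not in F_3")
    return [c[0] for c in poly]


for i in (0, 1, 2, 4, 5):
    print(i, minimal_polynomial(i))
```

Each list gives the coefficients from the constant term upwards, so `[1, 0, 1]` stands for $1 + x^2$.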

The following result will be useful when we study cyclic codes in 
Chapter 7. 

Theorem 3.4.11 Let $n$ be a positive integer with $\gcd(q, n) = 1$. Suppose that
$m$ is a positive integer satisfying $n \mid (q^m - 1)$. Let $\alpha$ be a primitive element
of $\mathbb{F}_{q^m}$ and let $M^{(j)}(x)$ be the minimal polynomial of $\alpha^j$ with respect to $\mathbb{F}_q$.
Let $\{s_1, \ldots, s_t\}$ be a complete set of representatives of cyclotomic cosets of
$q$ modulo $n$. Then the polynomial $x^n - 1$ has the factorization into monic
irreducible polynomials over $\mathbb{F}_q$:

$$x^n - 1 = \prod_{i=1}^{t} M^{\left(\frac{(q^m - 1)s_i}{n}\right)}(x).$$

Proof. Put $r = (q^m - 1)/n$. Then $\alpha^r$ is a primitive $n$th root of unity, and hence
all the roots of $x^n - 1$ are $1, \alpha^r, \alpha^{2r}, \ldots, \alpha^{(n-1)r}$. Thus, by the definition of the
minimal polynomial, the polynomials $M^{(ir)}(x)$ are divisors of $x^n - 1$, for all
$0 \le i \le n - 1$. It is clear that we have

$$x^n - 1 = \operatorname{lcm}(M^{(0)}(x), M^{(r)}(x), M^{(2r)}(x), \ldots, M^{((n-1)r)}(x)).$$

In order to determine the factorization of $x^n - 1$, it suffices to determine all
the distinct polynomials among $M^{(0)}(x), M^{(r)}(x), M^{(2r)}(x), \ldots, M^{((n-1)r)}(x)$.
By Remark 3.4.9, we know that $M^{(ir)}(x) = M^{(jr)}(x)$ if and only if $ir$ and
$jr$ are in the same cyclotomic coset of $q$ modulo $q^m - 1 = rn$; i.e., $i$ and $j$
are in the same cyclotomic coset of $q$ modulo $n$. This implies that all the
distinct polynomials among $M^{(0)}(x), M^{(r)}(x), M^{(2r)}(x), \ldots, M^{((n-1)r)}(x)$ are
$M^{(s_1 r)}(x), M^{(s_2 r)}(x), \ldots, M^{(s_t r)}(x)$. The proof is completed. □

The following result follows immediately from the above theorem. 

Corollary 3.4.12 Let $n$ be a positive integer with $\gcd(q, n) = 1$. Then the
number of monic irreducible factors of $x^n - 1$ over $\mathbb{F}_q$ is equal to the number
of cyclotomic cosets of $q$ modulo $n$.

Example 3.4.13 (i) Consider the polynomial $x^{13} - 1$ over $\mathbb{F}_3$. It is easy to check
that $\{0, 1, 2, 4, 7\}$ is a complete set of representatives of cyclotomic cosets of
3 modulo 13. Since 13 is a divisor of $3^3 - 1$, we consider the field $\mathbb{F}_{27}$. Let
$\alpha$ be a root of $1 + 2x + x^3$. By Example 3.3.12, $\alpha$ is a primitive element of
$\mathbb{F}_{27}$. By Example 3.4.7(ii), we know all the cyclotomic cosets of 3 modulo 26
containing multiples of 2. Hence, we obtain

$$M^{(0)}(x) = 2 + x,$$

$$M^{(2)}(x) = \prod_{j \in C_2} (x - \alpha^j) = (x - \alpha^2)(x - \alpha^6)(x - \alpha^{18}) = 2 + x + x^2 + x^3,$$

$$M^{(4)}(x) = \prod_{j \in C_4} (x - \alpha^j) = (x - \alpha^4)(x - \alpha^{12})(x - \alpha^{10}) = 2 + x^2 + x^3,$$

$$M^{(8)}(x) = \prod_{j \in C_8} (x - \alpha^j) = (x - \alpha^8)(x - \alpha^{20})(x - \alpha^{24}) = 2 + 2x + 2x^2 + x^3,$$

$$M^{(14)}(x) = \prod_{j \in C_{14}} (x - \alpha^j) = (x - \alpha^{14})(x - \alpha^{16})(x - \alpha^{22}) = 2 + 2x + x^3.$$

By Theorem 3.4.11, we obtain the factorization of $x^{13} - 1$ over $\mathbb{F}_3$ into monic
irreducible polynomials:

$$x^{13} - 1 = M^{(0)}(x) M^{(2)}(x) M^{(4)}(x) M^{(8)}(x) M^{(14)}(x)$$
$$= (2 + x)(2 + x + x^2 + x^3)(2 + x^2 + x^3)(2 + 2x + 2x^2 + x^3)(2 + 2x + x^3).$$

(ii) Consider the polynomial $x^{21} - 1$ over $\mathbb{F}_2$. It is easy to check that
{0, 1, 3, 5, 7, 9} is a complete set of representatives of cyclotomic cosets of
2 modulo 21. Since 21 is a divisor of $2^6 - 1$, we consider the field $\mathbb{F}_{64}$. Let
$\alpha$ be a root of $1 + x + x^6$. It can be verified that $\alpha$ is a primitive element of
$\mathbb{F}_{64}$ (check that $\alpha^3 \ne 1$, $\alpha^7 \ne 1$, $\alpha^9 \ne 1$ and $\alpha^{21} \ne 1$). We list the cyclotomic
cosets of 2 modulo 63 containing multiples of 3:

$C_0 = \{0\}$, $C_3 = \{3, 6, 12, 24, 48, 33\}$,

$C_9 = \{9, 18, 36\}$, $C_{15} = \{15, 30, 60, 57, 51, 39\}$,

$C_{21} = \{21, 42\}$, $C_{27} = \{27, 54, 45\}$.

Hence, we obtain

$$M^{(0)}(x) = 1 + x,$$

$$M^{(3)}(x) = \prod_{j \in C_3} (x - \alpha^j) = 1 + x + x^2 + x^4 + x^6,$$

$$M^{(9)}(x) = \prod_{j \in C_9} (x - \alpha^j) = 1 + x^2 + x^3,$$

$$M^{(15)}(x) = \prod_{j \in C_{15}} (x - \alpha^j) = 1 + x^2 + x^4 + x^5 + x^6,$$

$$M^{(21)}(x) = \prod_{j \in C_{21}} (x - \alpha^j) = 1 + x + x^2,$$

$$M^{(27)}(x) = \prod_{j \in C_{27}} (x - \alpha^j) = 1 + x + x^3.$$

By Theorem 3.4.11, we obtain the factorization of $x^{21} - 1$ over $\mathbb{F}_2$ into monic
irreducible polynomials:

$$x^{21} - 1 = M^{(0)}(x) M^{(3)}(x) M^{(9)}(x) M^{(15)}(x) M^{(21)}(x) M^{(27)}(x)$$

$$= (1 + x)(1 + x + x^2 + x^4 + x^6)(1 + x^2 + x^3)(1 + x^2 + x^4 + x^5 + x^6)(1 + x + x^2)(1 + x + x^3).$$
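
The factorization in part (ii) can be confirmed by multiplying the six factors back together. The sketch below is only an illustration: the helpers `poly_mul` and `from_exponents` are hypothetical names of our own, and polynomials are stored as coefficient lists with the constant term first.

```python
def poly_mul(a, b, p):
    """Multiply two coefficient lists (lowest degree first) modulo the prime p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out


def from_exponents(exps):
    """Binary polynomial whose nonzero terms are x^e for e in exps."""
    coeffs = [0] * (max(exps) + 1)
    for e in exps:
        coeffs[e] = 1
    return coeffs


# The six factors of x^21 - 1 over F_2, given by the exponents of their terms.
factors = [from_exponents(e) for e in (
    [0, 1], [0, 1, 2, 4, 6], [0, 2, 3],
    [0, 2, 4, 5, 6], [0, 1, 2], [0, 1, 3],
)]

product = [1]
for f in factors:
    product = poly_mul(product, f, 2)

target = from_exponents([0, 21])  # x^21 - 1 equals x^21 + 1 over F_2
print(product == target)
```

The same `poly_mul` helper, called with p = 3, can be used to check part (i) in the same way.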
Exercises

3.1 Show that the remainder of every square integer divided by 4 is either
0 or 1. Hence, show that there do not exist integers x and y such that
$x^2 + y^2 = 40403$.

3.2 Construct the addition and multiplication tables for the rings $\mathbb{Z}_5$ and $\mathbb{Z}_8$.

3.3 Find the multiplicative inverse of each of the following elements:

(a) 2, 5 and 8 in $\mathbb{Z}_{11}$,

(b) 4, 7 and 11 in $\mathbb{Z}_{17}$.

3.4 Let p be a prime.

(i) Show that $\binom{p}{j} \equiv 0 \pmod{p}$ for any $1 \le j \le p - 1$.

(ii) Show that $\binom{p-1}{j} \equiv (-1)^j \pmod{p}$ for any $1 \le j \le p - 1$.

(iii) Show that, for any two elements $\alpha, \beta$ in a field of characteristic p,
we have

$$(\alpha + \beta)^{p^k} = \alpha^{p^k} + \beta^{p^k}$$

for any $k \ge 0$.

3.5 (i) Verify that $f(x) = x^5 - 1 \in \mathbb{F}_{31}[x]$ can be written as the product

$$(x^2 - 3x + 2)(x^3 + 3x^2 + 7x + 15).$$

(ii) Determine the remainder of f(x) divided by $x^2 - 3x + 2$.

(iii) Compute the remainders of f(x) divided by $x^5$, $x^7$ and $x^4 + 5x^3$,
respectively.

3.6 Verify that the following polynomials are irreducible:

(a) $1 + x + x^2 + x^3 + x^4$, $1 + x + x^4$ and $1 + x^3 + x^4$ over $\mathbb{F}_2$;

(b) $1 + x^2$, $2 + x + x^2$ and $2 + 2x + x^2$ over $\mathbb{F}_3$.

3.7 Every quadratic or cubic polynomial is either irreducible or has a linear
factor.

(a) Find the number of monic irreducible quadratic polynomials over $\mathbb{F}_q$.

(b) Find the number of monic irreducible cubic polynomials over $\mathbb{F}_q$.

(c) Determine all the irreducible quadratic and cubic polynomials over
$\mathbb{F}_2$.

(d) Determine all the monic irreducible quadratic polynomials over $\mathbb{F}_3$.

3.8 Let $f(x) = (2 + 2x^2)(2 + x^2 + x^3)^2(-1 + x^4) \in \mathbb{F}_3[x]$ and $g(x) =
(1 + x^2)(-2 + 2x^2)(2 + x^2 + x^3) \in \mathbb{F}_3[x]$. Determine gcd(f(x), g(x))
and lcm(f(x), g(x)).

3.9 Find two polynomials u(x) and $v(x) \in \mathbb{F}_2[x]$ such that deg(u(x)) < 4,
deg(v(x)) < 3 and

$$u(x)(1 + x + x^3) + v(x)(1 + x + x^2 + x^3 + x^4) = 1.$$

3.10 Construct both the addition and multiplication tables for the ring
$\mathbb{F}_3[x]/(x^2 + 2)$.

3.11 (a) Find the order of the elements 2, 7, 10 and 12 in $\mathbb{F}_{17}$.

(b) Find the order of the elements $\alpha$, $\alpha^3$, $\alpha + 1$ and $\alpha^3 + 1$ in $\mathbb{F}_{16}$, where
$\alpha$ is a root of $1 + x + x^4$.

3.12 (i) Let $\alpha$ be a primitive element of $\mathbb{F}_q$. Show that $\alpha^i$ is also a primitive
element if and only if gcd(i, q - 1) = 1.

(ii) Determine the number of primitive elements in the following fields:
$\mathbb{F}_9$, $\mathbb{F}_{19}$, $\mathbb{F}_{25}$ and $\mathbb{F}_{32}$.

3.13 Determine all the primitive elements of the following fields: $\mathbb{F}_7$, $\mathbb{F}_8$ and $\mathbb{F}_9$.

3.14 Show that all the nonzero elements, except the identity 1, in $\mathbb{F}_{128}$ are
primitive elements.

3.15 Show that any root of $1 + x + x^6 \in \mathbb{F}_2[x]$ is a primitive element of $\mathbb{F}_{64}$.

3.16 Consider the field with 16 elements constructed using the irreducible
polynomial $f(x) = 1 + x^3 + x^4$ over $\mathbb{F}_2$.

(i) Let $\alpha$ be a root of f(x). Show that $\alpha$ is a primitive element of $\mathbb{F}_{16}$.
Represent each element both as a polynomial and as a power of $\alpha$.

(ii) Construct a Zech's log table for the field.

3.17 Find a primitive element and construct a Zech's log table for each of the
following finite fields:

(a) $\mathbb{F}_{3^2}$, (b) $\mathbb{F}_{2^5}$, (c) $\mathbb{F}_{5^2}$.

3.18 Show that each monic irreducible polynomial of $\mathbb{F}_q[x]$ of degree m is the
minimal polynomial of some element of $\mathbb{F}_{q^m}$ with respect to $\mathbb{F}_q$.

3.19 Let $\alpha$ be a root of $1 + x^3 + x^4 \in \mathbb{F}_2[x]$.

(i) List all the cyclotomic cosets of 2 modulo 15.

(ii) Find the minimal polynomial of $\alpha^i \in \mathbb{F}_{16}$, for all $1 \le i \le 14$.

(iii) Using Exercise 3.18, find all the irreducible polynomials of degree
4 over $\mathbb{F}_2$.

3.20 (i) Find all the cyclotomic cosets of 2 modulo 31.

(ii) Find the minimal polynomials of $\alpha$, $\alpha^4$ and $\alpha^5$, where $\alpha$ is a root of
$1 + x^2 + x^5 \in \mathbb{F}_2[x]$.

3.21 Based on the cyclotomic cosets of 3 modulo 26, find all the monic
irreducible polynomials of degree 3 over $\mathbb{F}_3$.

3.22 (i) Prove that, if k is a positive divisor of m, then $\mathbb{F}_{p^m}$ contains a unique
subfield with $p^k$ elements.

(ii) Determine all the subfields in (a) $\mathbb{F}_{2^{12}}$, (b) $\mathbb{F}_{2^{18}}$.

3.23 Factorize the following polynomials:

(a) $x^7 - 1$ over $\mathbb{F}_2$; (b) $x^{15} - 1$ over $\mathbb{F}_2$;

(c) $x^{31} - 1$ over $\mathbb{F}_2$; (d) $x^8 - 1$ over $\mathbb{F}_3$;

(e) $x^{12} - 1$ over $\mathbb{F}_5$; (f) $x^{24} - 1$ over $\mathbb{F}_7$.

3.24 Show that, for any given element c of $\mathbb{F}_q$, there exist two elements a and
b of $\mathbb{F}_q$ such that $a^2 + b^2 = c$.

3.25 For a nonzero element b of $\mathbb{F}_p$, where p is a prime, prove that the trinomial
$x^p - x - b$ is irreducible in $\mathbb{F}_{p^n}[x]$ if and only if n is not divisible by p.

3.26 (Lagrange interpolation formula.) For $n \ge 1$, let $\alpha_1, \ldots, \alpha_n$ be n distinct
elements of $\mathbb{F}_q$, and let $\beta_1, \ldots, \beta_n$ be n arbitrary elements of $\mathbb{F}_q$. Show
that there exists exactly one polynomial $f(x) \in \mathbb{F}_q[x]$ of degree $\le n - 1$
such that $f(\alpha_i) = \beta_i$ for $i = 1, \ldots, n$. Furthermore, show that this
polynomial is given by

$$f(x) = \sum_{i=1}^{n} \frac{\beta_i}{g'(\alpha_i)} \prod_{\substack{k=1 \\ k \ne i}}^{n} (x - \alpha_k),$$

where $g'(x)$ denotes the derivative of $g(x) := \prod_{k=1}^{n} (x - \alpha_k)$.

3.27 (i) Show that, for every integer $n \ge 1$, the product of all monic
irreducible polynomials over $\mathbb{F}_q$ whose degrees divide n is equal
to $x^{q^n} - x$.

(ii) Let $I_q(d)$ denote the number of monic irreducible polynomials of
degree d in $\mathbb{F}_q[x]$. Show that

$$q^n = \sum_{d \mid n} d\, I_q(d) \quad \text{for all } n \in \mathbb{N},$$

where the sum is extended over all positive divisors d of n.

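3.28 The Möbius function on the set $\mathbb{N}$ of positive integers is defined by

$$\mu(n) = \begin{cases} 1 & \text{if } n = 1 \\ (-1)^k & \text{if } n \text{ is a product of } k \text{ distinct primes} \\ 0 & \text{if } n \text{ is divisible by the square of a prime.} \end{cases}$$

(i) Let h and H be two functions from $\mathbb{N}$ to $\mathbb{Z}$. Show that

$$H(n) = \sum_{d \mid n} h(d)$$

for all $n \in \mathbb{N}$ if and only if

$$h(n) = \sum_{d \mid n} \mu(d) H\left(\frac{n}{d}\right)$$

for all $n \in \mathbb{N}$.

(ii) Show that the number $I_q(n)$ of monic irreducible polynomials over
$\mathbb{F}_q$ of degree n is given by

$$I_q(n) = \frac{1}{n} \sum_{d \mid n} \mu(d) q^{n/d}.$$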
A linear code of length n over the finite field $\mathbb{F}_q$ is simply a subspace of the
vector space $\mathbb{F}_q^n$. Since linear codes are vector spaces, their algebraic structures
often make them easier to describe and use than nonlinear codes. In most of
this book, we focus our attention on linear codes over finite fields.


4.1 Vector spaces over finite fields 

We recall some definitions and facts about vector spaces over finite fields. While 
the proofs of most of the facts stated in this section are omitted, it should be 
noted that many of them are practically identical to those in the case of vector 
spaces over R or C. 

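Definition 4.1.1 Let $\mathbb{F}_q$ be the finite field of order q. A nonempty set V,
together with some (vector) addition + and scalar multiplication by elements
of $\mathbb{F}_q$, is a vector space (or linear space) over $\mathbb{F}_q$ if it satisfies all of the following
conditions. For all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$ and for all $\lambda, \mu \in \mathbb{F}_q$:

(i) $\mathbf{u} + \mathbf{v} \in V$;

(ii) $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$;

(iii) there is an element $\mathbf{0} \in V$ with the property $\mathbf{0} + \mathbf{v} = \mathbf{v} = \mathbf{v} + \mathbf{0}$ for all
$\mathbf{v} \in V$;

(iv) for each $\mathbf{u} \in V$ there is an element of V, called $-\mathbf{u}$, such that $\mathbf{u} + (-\mathbf{u}) =
\mathbf{0} = (-\mathbf{u}) + \mathbf{u}$;

(v) $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$;

(vi) $\lambda\mathbf{v} \in V$;

(vii) $\lambda(\mathbf{u} + \mathbf{v}) = \lambda\mathbf{u} + \lambda\mathbf{v}$, $(\lambda + \mu)\mathbf{u} = \lambda\mathbf{u} + \mu\mathbf{u}$;

(viii) $(\lambda\mu)\mathbf{u} = \lambda(\mu\mathbf{u})$;

(ix) if 1 is the multiplicative identity of $\mathbb{F}_q$, then $1\mathbf{u} = \mathbf{u}$.

Let $\mathbb{F}_q^n$ be the set of all vectors of length n with entries in $\mathbb{F}_q$:

$$\mathbb{F}_q^n = \{(v_1, v_2, \ldots, v_n) : v_i \in \mathbb{F}_q\}.$$

We define the vector addition for $\mathbb{F}_q^n$ componentwise, using the addition defined
on $\mathbb{F}_q$; i.e., if

$$\mathbf{v} = (v_1, \ldots, v_n) \in \mathbb{F}_q^n \quad \text{and} \quad \mathbf{w} = (w_1, \ldots, w_n) \in \mathbb{F}_q^n,$$

then

$$\mathbf{v} + \mathbf{w} = (v_1 + w_1, \ldots, v_n + w_n) \in \mathbb{F}_q^n.$$

We also define the scalar multiplication for $\mathbb{F}_q^n$ componentwise; i.e., if
$\mathbf{v} = (v_1, \ldots, v_n) \in \mathbb{F}_q^n$ and $\lambda \in \mathbb{F}_q$,

then

$$\lambda\mathbf{v} = (\lambda v_1, \ldots, \lambda v_n) \in \mathbb{F}_q^n.$$

Let $\mathbf{0}$ denote the zero vector $(0, 0, \ldots, 0) \in \mathbb{F}_q^n$.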

Example 4.1.2 It is easy to verify that the following are vector spaces over $\mathbb{F}_q$:

(i) (any q) $C_1 = \mathbb{F}_q^n$ and $C_2 = \{\mathbf{0}\}$;

(ii) (any q) $C_3 = \{(\lambda, \ldots, \lambda) : \lambda \in \mathbb{F}_q\}$;

(iii) (q = 2) $C_4 = \{(0, 0, 0, 0), (1, 0, 1, 0), (0, 1, 0, 1), (1, 1, 1, 1)\}$;

(iv) (q = 3) $C_5 = \{(0, 0, 0), (0, 1, 2), (0, 2, 1)\}$.

Remark 4.1.3 When no confusion arises, it is sometimes convenient to write
a vector $(v_1, v_2, \ldots, v_n)$ simply as $v_1 v_2 \cdots v_n$.

Definition 4.1.4 A nonempty subset C of a vector space V is a subspace of V if 
it is itself a vector space with the same vector addition and scalar multiplication 
as V. 

Example 4.1.5 Using the same notation as in Example 4.1.2, it is easy to see
that:

(i) (any q) $C_2 = \{\mathbf{0}\}$ is a subspace of both $C_3$ and $C_1 = \mathbb{F}_q^n$, and $C_3$ is a
subspace of $C_1 = \mathbb{F}_q^n$;

(ii) (q = 2) $C_4$ is a subspace of $\mathbb{F}_2^4$;

(iii) (q = 3) $C_5$ is a subspace of $\mathbb{F}_3^3$.

Proposition 4.1.6 A nonempty subset C of a vector space V over $\mathbb{F}_q$ is a
subspace if and only if the following condition is satisfied:

if $\mathbf{x}, \mathbf{y} \in C$ and $\lambda, \mu \in \mathbb{F}_q$, then $\lambda\mathbf{x} + \mu\mathbf{y} \in C$.

We leave the proof of Proposition 4.1.6 as an exercise (see Exercise 4.1).
Note that, when q = 2, a necessary and sufficient condition for a nonempty
subset C of a vector space V over $\mathbb{F}_2$ to be a subspace is: if $\mathbf{x}, \mathbf{y} \in C$, then
$\mathbf{x} + \mathbf{y} \in C$.
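
For small examples the criterion of Proposition 4.1.6 can be tested exhaustively. The following is a minimal sketch under the assumption that q is prime, so that arithmetic in $\mathbb{F}_q$ is arithmetic modulo q; the function name `is_subspace` is ours and purely illustrative.

```python
from itertools import product


def is_subspace(C, q):
    """Test the criterion of Proposition 4.1.6 for a set C of tuples over F_q (q prime)."""
    C = set(C)
    if not C:
        return False
    for x, y in product(C, repeat=2):
        for lam, mu in product(range(q), repeat=2):
            z = tuple((lam * a + mu * b) % q for a, b in zip(x, y))
            if z not in C:
                return False
    return True


C4 = {(0, 0, 0, 0), (1, 0, 1, 0), (0, 1, 0, 1), (1, 1, 1, 1)}
C5 = {(0, 0, 0), (0, 1, 2), (0, 2, 1)}
print(is_subspace(C4, 2), is_subspace(C5, 3))
```

A set such as {(0, 0), (1, 0), (0, 1)} over $\mathbb{F}_2$ fails the test, since the sum (1, 1) of two of its elements is missing.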

Definition 4.1.7 Let V be a vector space over $\mathbb{F}_q$. A linear combination of
$\mathbf{v}_1, \ldots, \mathbf{v}_r \in V$ is a vector of the form $\lambda_1\mathbf{v}_1 + \cdots + \lambda_r\mathbf{v}_r$, where $\lambda_1, \ldots, \lambda_r \in \mathbb{F}_q$
are some scalars.

Definition 4.1.8 Let V be a vector space over $\mathbb{F}_q$. A set of vectors $\{\mathbf{v}_1, \ldots, \mathbf{v}_r\}$
in V is linearly independent if

$$\lambda_1\mathbf{v}_1 + \cdots + \lambda_r\mathbf{v}_r = \mathbf{0} \;\Rightarrow\; \lambda_1 = \cdots = \lambda_r = 0.$$

The set is linearly dependent if it is not linearly independent; i.e., if there are
$\lambda_1, \ldots, \lambda_r \in \mathbb{F}_q$, not all zero (but maybe some are!), such that $\lambda_1\mathbf{v}_1 + \cdots +
\lambda_r\mathbf{v}_r = \mathbf{0}$.

Example 4.1.9 (i) Any set S which contains $\mathbf{0}$ is linearly dependent.

(ii) For any $\mathbb{F}_q$, {(0, 0, 0, 1), (0, 0, 1, 0), (0, 1, 0, 0)} is linearly independent.

(iii) For any $\mathbb{F}_q$, {(0, 0, 0, 1), (1, 0, 0, 0), (1, 0, 0, 1)} is linearly dependent.

Definition 4.1.10 Let V be a vector space over $\mathbb{F}_q$ and let $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$
be a nonempty subset of V. The (linear) span of S is defined as

$$\langle S \rangle = \{\lambda_1\mathbf{v}_1 + \cdots + \lambda_k\mathbf{v}_k : \lambda_i \in \mathbb{F}_q\}.$$

If $S = \emptyset$, we define $\langle S \rangle = \{\mathbf{0}\}$. It is easy to verify that $\langle S \rangle$ is a subspace of
V, called the subspace generated (or spanned) by S. Given a subspace C of V,
a subset S of C is called a generating set (or spanning set) of C if $C = \langle S \rangle$.

Remark 4.1.11 If S is already a subspace of V, then $\langle S \rangle = S$.

Example 4.1.12 (i) If q = 2 and S = {0001, 0010, 0100}, then

$\langle S \rangle$ = {0000, 0001, 0010, 0100, 0011, 0101, 0110, 0111}.

(ii) If q = 2 and S = {0001, 1000, 1001}, then

$\langle S \rangle$ = {0000, 0001, 1000, 1001}.

(iii) If q = 3 and S = {0001, 1000, 1001}, then

$\langle S \rangle$ = {0000, 0001, 0002, 1000, 2000, 1001, 1002, 2001, 2002}.
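
The spans in Example 4.1.12 can be reproduced by running through all coefficient tuples. This is a small sketch assuming q is prime (so that scalars are integers modulo q) and that words are written as digit strings as in Remark 4.1.3; the name `span` is our own.

```python
from itertools import product


def span(S, q):
    """All F_q-linear combinations of the words in S (q prime), as sorted digit strings."""
    words = [tuple(int(ch) for ch in w) for w in S]
    n = len(words[0])
    result = set()
    for coeffs in product(range(q), repeat=len(words)):
        v = tuple(sum(c * w[i] for c, w in zip(coeffs, words)) % q for i in range(n))
        result.add("".join(map(str, v)))
    return sorted(result)


print(span(["0001", "1000", "1001"], 2))
print(len(span(["0001", "1000", "1001"], 3)))
```

The same generating set spans 4 words over $\mathbb{F}_2$ and 9 words over $\mathbb{F}_3$, in agreement with parts (ii) and (iii).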


Definition 4.1.13 Let V be a vector space over $\mathbb{F}_q$. A nonempty subset $B =
\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ of V is called a basis for V if $V = \langle B \rangle$ and B is linearly
independent.


Remark 4.1.14 (i) If $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is a basis of V, then any vector $\mathbf{v} \in V$
can be expressed as a unique linear combination of vectors in B; i.e., there exist
unique $\lambda_1, \lambda_2, \ldots, \lambda_k \in \mathbb{F}_q$ such that

$$\mathbf{v} = \lambda_1\mathbf{v}_1 + \lambda_2\mathbf{v}_2 + \cdots + \lambda_k\mathbf{v}_k.$$


(ii) A vector space V over a finite field $\mathbb{F}_q$ can have many bases; but all bases
contain the same number of elements. This number is called the dimension of
V over $\mathbb{F}_q$, denoted by dim(V). In the case where V can be regarded as a vector
space over more than one field, the notation $\dim_{\mathbb{F}_q}(V)$ may be used to avoid
confusion.


Theorem 4.1.15 Let V be a vector space over $\mathbb{F}_q$. If dim(V) = k, then

(i) V has $q^k$ elements;

(ii) V has $\frac{1}{k!} \prod_{i=0}^{k-1} (q^k - q^i)$ different bases.

Proof. (i) If $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is a basis for V, then

$$V = \{\lambda_1\mathbf{v}_1 + \cdots + \lambda_k\mathbf{v}_k : \lambda_1, \ldots, \lambda_k \in \mathbb{F}_q\}.$$

Since $|\mathbb{F}_q| = q$, there are exactly q choices for each of $\lambda_1, \ldots, \lambda_k$; hence, V
has exactly $q^k$ elements.

(ii) Let $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ denote a basis for V. Since $\mathbf{v}_1 \ne \mathbf{0}$, there are
$q^k - 1$ choices for $\mathbf{v}_1$. For B to be a basis, the condition $\mathbf{v}_2 \notin \langle \mathbf{v}_1 \rangle$ is needed,
so there are $q^k - q$ choices for $\mathbf{v}_2$. Arguing in this manner, for every i such
that $k \ge i \ge 2$, we need $\mathbf{v}_i \notin \langle \mathbf{v}_1, \ldots, \mathbf{v}_{i-1} \rangle$, so there are $q^k - q^{i-1}$ choices
for $\mathbf{v}_i$. Hence, there are $\prod_{i=0}^{k-1} (q^k - q^i)$ distinct ordered k-tuples $(\mathbf{v}_1, \ldots, \mathbf{v}_k)$.
However, since the order of $\mathbf{v}_1, \ldots, \mathbf{v}_k$ is irrelevant for a basis, the number of
distinct bases for V is $\frac{1}{k!} \prod_{i=0}^{k-1} (q^k - q^i)$. □

Example 4.1.16 Let q = 2, S = {0001, 0010, 0100} and $V = \langle S \rangle$; then

V = {0000, 0001, 0010, 0100, 0011, 0101, 0110, 0111}.

Note that S is linearly independent, so dim(V) = 3. We see that $|V| = 8 = 2^3$.
By Theorem 4.1.15, the number of different bases for V is given by

$$\frac{1}{k!} \prod_{i=0}^{k-1} (2^k - 2^i) = \frac{1}{3!} (2^3 - 1)(2^3 - 2)(2^3 - 2^2) = 28.$$
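
The count in Theorem 4.1.15(ii) can be compared with a direct enumeration in a small case. The sketch below is illustrative only: `num_bases` evaluates the formula, while `brute_force_bases_f2` (a hypothetical helper restricted to q = 2) counts the k-element subsets of $\mathbb{F}_2^k$ whose span is the whole space.

```python
from itertools import combinations, product
from math import factorial


def num_bases(q, k):
    """Number of bases of a k-dimensional space over F_q, by Theorem 4.1.15(ii)."""
    prod = 1
    for i in range(k):
        prod *= q**k - q**i
    return prod // factorial(k)


def brute_force_bases_f2(k):
    """Count the k-subsets of F_2^k that span F_2^k, by exhaustive search."""
    vectors = [v for v in product(range(2), repeat=k) if any(v)]
    count = 0
    for cand in combinations(vectors, k):
        spanned = {
            tuple(sum(c * w[i] for c, w in zip(coeffs, cand)) % 2 for i in range(k))
            for coeffs in product(range(2), repeat=k)
        }
        if len(spanned) == 2**k:
            count += 1
    return count


print(num_bases(2, 3), brute_force_bases_f2(3))
```

Both numbers agree with the value 28 obtained in Example 4.1.16.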

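Definition 4.1.17 Let $\mathbf{v} = (v_1, v_2, \ldots, v_n)$, $\mathbf{w} = (w_1, w_2, \ldots, w_n) \in \mathbb{F}_q^n$.

(i) The scalar product (also known as the dot product or the Euclidean inner
product) of $\mathbf{v}$ and $\mathbf{w}$ is defined as

$$\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + \cdots + v_n w_n \in \mathbb{F}_q.$$

(ii) The two vectors $\mathbf{v}$ and $\mathbf{w}$ are said to be orthogonal if $\mathbf{v} \cdot \mathbf{w} = 0$.

(iii) Let S be a nonempty subset of $\mathbb{F}_q^n$. The orthogonal complement $S^\perp$ of
S is defined to be

$$S^\perp = \{\mathbf{v} \in \mathbb{F}_q^n : \mathbf{v} \cdot \mathbf{s} = 0 \text{ for all } \mathbf{s} \in S\}.$$

If $S = \emptyset$, then we define $S^\perp = \mathbb{F}_q^n$.

Remark 4.1.18 (i) It is easy to verify that $S^\perp$ is always a subspace of the vector
space $\mathbb{F}_q^n$ for any subset S of $\mathbb{F}_q^n$, and that $\langle S \rangle^\perp = S^\perp$.

(ii) The scalar product is an example of an inner product on $\mathbb{F}_q^n$. An inner
product on $\mathbb{F}_q^n$ is a pairing $\langle \cdot, \cdot \rangle : \mathbb{F}_q^n \times \mathbb{F}_q^n \to \mathbb{F}_q$ satisfying the following
conditions: for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in \mathbb{F}_q^n$,

(a) $\langle \mathbf{u} + \mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{w} \rangle + \langle \mathbf{v}, \mathbf{w} \rangle$;

(b) $\langle \mathbf{u}, \mathbf{v} + \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{u}, \mathbf{w} \rangle$;

(c) $\langle \mathbf{u}, \mathbf{v} \rangle = 0$ for all $\mathbf{u} \in \mathbb{F}_q^n$ if and only if $\mathbf{v} = \mathbf{0}$;

(d) $\langle \mathbf{u}, \mathbf{v} \rangle = 0$ for all $\mathbf{v} \in \mathbb{F}_q^n$ if and only if $\mathbf{u} = \mathbf{0}$.

The scalar product in Definition 4.1.17 is often called the Euclidean inner
product. Some other inner products, such as the Hermitian inner product and
symplectic inner product, are also used in coding theory (see Exercises 4.9-4.13).
Throughout this book, unless it is otherwise specified, the inner product
used is always assumed to be the scalar product, i.e., the Euclidean inner
product.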
Example 4.1.19 (i) Let q = 2 and let n = 4. If $\mathbf{u} = (1, 1, 1, 1)$, $\mathbf{v} =
(1, 1, 1, 0)$, $\mathbf{w} = (1, 0, 0, 1)$, then

$\mathbf{u} \cdot \mathbf{v} = 1 \cdot 1 + 1 \cdot 1 + 1 \cdot 1 + 1 \cdot 0 = 1$,
$\mathbf{u} \cdot \mathbf{w} = 1 \cdot 1 + 1 \cdot 0 + 1 \cdot 0 + 1 \cdot 1 = 0$,
$\mathbf{v} \cdot \mathbf{w} = 1 \cdot 1 + 1 \cdot 0 + 1 \cdot 0 + 0 \cdot 1 = 1$.

Hence, $\mathbf{u}$ and $\mathbf{w}$ are orthogonal.

(ii) Let q = 2 and let S = {0100, 0101}. To find $S^\perp$, let $\mathbf{v} = (v_1, v_2, v_3, v_4) \in
S^\perp$. Then

$\mathbf{v} \cdot (0, 1, 0, 0) = 0 \Rightarrow v_2 = 0$,
$\mathbf{v} \cdot (0, 1, 0, 1) = 0 \Rightarrow v_2 + v_4 = 0$.

Hence, we have $v_2 = v_4 = 0$. Since $v_1$ and $v_3$ can be either 0 or 1, we can
conclude that

$S^\perp$ = {0000, 0010, 1000, 1010}.
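
For short lengths the orthogonal complement can also be found by testing every vector of $\mathbb{F}_q^n$ against the words of S. The following minimal sketch assumes q is prime; the function name `orthogonal_complement` is our own choice.

```python
from itertools import product


def orthogonal_complement(S, q, n):
    """All words of F_q^n (q prime) orthogonal to every word of S, as digit strings."""
    words = [tuple(int(ch) for ch in w) for w in S]
    complement = []
    for v in product(range(q), repeat=n):
        if all(sum(a * b for a, b in zip(v, s)) % q == 0 for s in words):
            complement.append("".join(map(str, v)))
    return complement


print(orthogonal_complement(["0100", "0101"], 2, 4))
```

An exhaustive search of this kind visits all $q^n$ vectors, so it is only practical for very small n; solving the linear system, as in the example, scales far better.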

Theorem 4.1.20 Let S be a subset of $\mathbb{F}_q^n$; then we have
$$\dim(\langle S \rangle) + \dim(S^\perp) = n.$$

Proof. Theorem 4.1.20 is obviously true when $\langle S \rangle = \{\mathbf{0}\}$.

Now let $\dim(\langle S \rangle) = k \ge 1$ and suppose $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is a basis of $\langle S \rangle$.
We need to show that $\dim(S^\perp) = \dim(\langle S \rangle^\perp) = n - k$.

Note that $\mathbf{x} \in S^\perp$ if and only if

$$\mathbf{v}_1 \cdot \mathbf{x} = \cdots = \mathbf{v}_k \cdot \mathbf{x} = 0,$$

which is equivalent to saying that $\mathbf{x}$ satisfies $A\mathbf{x}^T = \mathbf{0}$, where A is the $k \times n$
matrix whose ith row is $\mathbf{v}_i$.

The rows of A are linearly independent, so $A\mathbf{x}^T = \mathbf{0}$ is a linear system of k
linearly independent equations in n variables. From linear algebra, it is known
that such a system admits a solution space of dimension n - k. □

Example 4.1.21 Let q = 2, n = 4 and S = {0100, 0101}. Then
$\langle S \rangle$ = {0000, 0100, 0001, 0101}.

Note that S is linearly independent, so $\dim(\langle S \rangle) = 2$. We have computed
that (Example 4.1.19)

$S^\perp$ = {0000, 0010, 1000, 1010}.

Note that {0010, 1000} is a basis for $S^\perp$, so $\dim(S^\perp) = 2$. Hence, we have
verified that

$$\dim(\langle S \rangle) + \dim(S^\perp) = 2 + 2 = 4 = n.$$


4.2 Linear codes 

We are now ready to introduce linear codes and discuss some of their elementary 
properties. 

Definition 4.2.1 A linear code C of length n over $\mathbb{F}_q$ is a subspace of $\mathbb{F}_q^n$.

Example 4.2.2 The following are linear codes:

(i) $C = \{(\lambda, \lambda, \ldots, \lambda) : \lambda \in \mathbb{F}_q\}$. This code is often called a repetition code
(refer also to Example 1.0.3).

(ii) (q = 2) C = {000, 001, 010, 011}.

(iii) (q = 3) C = {0000, 1100, 2200, 0001, 0002, 1101, 1102, 2201, 2202}.

(iv) (q = 2) C = {000, 001, 010, 011, 100, 101, 110, 111}.

Definition 4.2.3 Let C be a linear code in $\mathbb{F}_q^n$.

(i) The dual code of C is $C^\perp$, the orthogonal complement of the subspace C
of $\mathbb{F}_q^n$.

(ii) The dimension of the linear code C is the dimension of C as a vector
space over $\mathbb{F}_q$, i.e., dim(C).

Theorem 4.2.4 Let C be a linear code of length n over $\mathbb{F}_q$. Then,

(i) $|C| = q^{\dim(C)}$, i.e., $\dim(C) = \log_q |C|$;

(ii) $C^\perp$ is a linear code and $\dim(C) + \dim(C^\perp) = n$;

(iii) $(C^\perp)^\perp = C$.

Proof. (i) follows from Theorem 4.1.15(i).

(ii) follows immediately from Remark 4.1.18(i) and Theorem 4.1.20 with
C = S.

Using the equality in (ii) and a similar equality with C replaced by $C^\perp$, we
obtain $\dim(C) = \dim((C^\perp)^\perp)$. To prove (iii), it therefore suffices to show that
$C \subseteq (C^\perp)^\perp$.

Let $\mathbf{c} \in C$. To show that $\mathbf{c} \in (C^\perp)^\perp$, we need to show that $\mathbf{c} \cdot \mathbf{x} = 0$ for
all $\mathbf{x} \in C^\perp$. Since $\mathbf{c} \in C$ and $\mathbf{x} \in C^\perp$, by the definition of $C^\perp$, it follows that
$\mathbf{c} \cdot \mathbf{x} = 0$. Hence, (iii) is proved. □

Example 4.2.5 (i) (q = 2) Let C = {0000, 1010, 0101, 1111}, so dim(C) =
$\log_2 |C| = \log_2 4 = 2$. It is easy to see that $C^\perp$ = {0000, 1010, 0101, 1111} =
C, so $\dim(C^\perp) = 2$. In particular, Theorem 4.2.4(ii) and (iii) are verified.

(ii) (q = 3) Let C = {000, 001, 002, 010, 020, 011, 012, 021, 022}, so
dim(C) = $\log_3 |C| = \log_3 9 = 2$. One checks readily that $C^\perp$ = {000, 100,
200}, so $\dim(C^\perp) = 1$.
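
The statements of Theorem 4.2.4 can be checked mechanically on the two codes of Example 4.2.5. The following sketch assumes q is prime and finds the dual code by exhaustive search; `dual` and `dimension` are illustrative names of our own.

```python
from itertools import product


def dual(C, q):
    """Dual code of C (digit strings over F_q, q prime), found by exhaustive search."""
    words = [tuple(int(ch) for ch in w) for w in C]
    n = len(words[0])
    return {
        "".join(map(str, v))
        for v in product(range(q), repeat=n)
        if all(sum(a * b for a, b in zip(v, c)) % q == 0 for c in words)
    }


def dimension(C, q):
    """Return k such that q**k == |C|, for a linear code C (Theorem 4.2.4(i))."""
    k, size = 0, 1
    while size < len(C):
        size *= q
        k += 1
    return k


C1 = {"0000", "1010", "0101", "1111"}
C2 = {"000", "001", "002", "010", "020", "011", "012", "021", "022"}

print(dual(C1, 2) == C1)                            # C1 coincides with its dual
print(sorted(dual(C2, 3)))                          # dual code of C2
print(dimension(C2, 3) + dimension(dual(C2, 3), 3)) # equals the length n = 3
print(dual(dual(C2, 3), 3) == C2)                   # the double dual recovers C2
```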

Remark 4.2.6 A linear code C of length n and dimension k over $\mathbb{F}_q$ is often
called a q-ary [n, k]-code or, if q is clear from the context, an [n, k]-code. It is
also an $(n, q^k)$-linear code. If the distance d of C is known, it is also sometimes
referred to as an [n, k, d]-linear code.

Definition 4.2.7 Let C be a linear code.

(i) C is self-orthogonal if $C \subseteq C^\perp$.

(ii) C is self-dual if $C = C^\perp$.

Proposition 4.2.8 The dimension of a self-orthogonal code of length n must
be $\le n/2$, and the dimension of a self-dual code of length n is n/2.

Proof. This proposition is an immediate consequence of Theorem 4.2.4(ii) and
the definitions of self-orthogonal and self-dual codes. □

Example 4.2.9 The code in Example 4.2.5(i) is self-dual. 


4.3 Hamming weight 

Recall that the Hamming distance $d(\mathbf{x}, \mathbf{y})$ between two words $\mathbf{x}, \mathbf{y} \in \mathbb{F}_q^n$ was
defined in Chapter 2.

Definition 4.3.1 Let $\mathbf{x}$ be a word in $\mathbb{F}_q^n$. The (Hamming) weight of $\mathbf{x}$, denoted
by wt($\mathbf{x}$), is defined to be the number of nonzero coordinates in $\mathbf{x}$; i.e.,

$$\mathrm{wt}(\mathbf{x}) = d(\mathbf{x}, \mathbf{0}),$$

where $\mathbf{0}$ is the zero word.


Remark 4.3.2 For every element x of $\mathbb{F}_q$, we can define the Hamming weight
as follows:

$$\mathrm{wt}(x) = d(x, 0) = \begin{cases} 1 & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}$$

Then, writing $\mathbf{x} \in \mathbb{F}_q^n$ as $\mathbf{x} = (x_1, x_2, \ldots, x_n)$, the Hamming weight of $\mathbf{x}$ can
also be equivalently defined as

$$\mathrm{wt}(\mathbf{x}) = \mathrm{wt}(x_1) + \mathrm{wt}(x_2) + \cdots + \mathrm{wt}(x_n). \qquad (4.1)$$

Table 4.1.

  x   y   x*y   wt(x) + wt(y) - 2wt(x*y)   wt(x + y)
  0   0    0               0                   0
  0   1    0               1                   1
  1   0    0               1                   1
  1   1    1               0                   0

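Lemma 4.3.3 If $\mathbf{x}, \mathbf{y} \in \mathbb{F}_q^n$, then $d(\mathbf{x}, \mathbf{y}) = \mathrm{wt}(\mathbf{x} - \mathbf{y})$.

Proof. For $x, y \in \mathbb{F}_q$, d(x, y) = 0 if and only if x = y, which is true if and
only if x - y = 0 or, equivalently, wt(x - y) = 0. Lemma 4.3.3 now follows
from (2.1) and (4.1). □

Since a = -a for all $a \in \mathbb{F}_q$ when q is even, the following corollary is an
immediate consequence of Lemma 4.3.3.

Corollary 4.3.4 Let q be even. If $\mathbf{x}, \mathbf{y} \in \mathbb{F}_q^n$, then $d(\mathbf{x}, \mathbf{y}) = \mathrm{wt}(\mathbf{x} + \mathbf{y})$.

For $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ in $\mathbb{F}_q^n$, let
$$\mathbf{x} \star \mathbf{y} = (x_1 y_1, x_2 y_2, \ldots, x_n y_n).$$

Lemma 4.3.5 If $\mathbf{x}, \mathbf{y} \in \mathbb{F}_2^n$, then

$$\mathrm{wt}(\mathbf{x} + \mathbf{y}) = \mathrm{wt}(\mathbf{x}) + \mathrm{wt}(\mathbf{y}) - 2\,\mathrm{wt}(\mathbf{x} \star \mathbf{y}). \qquad (4.2)$$

Proof. From (4.1), it is enough to show that (4.2) is true for $x, y \in \mathbb{F}_2$. This
can be easily verified as in Table 4.1. □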
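
Identity (4.2) can also be checked for whole words rather than single coordinates. The sketch below runs over all pairs of binary words of a given small length; the helper names `wt` and `identity_holds` are our own and the choice n = 4 is arbitrary.

```python
from itertools import product


def wt(x):
    """Hamming weight: the number of nonzero coordinates of x."""
    return sum(1 for a in x if a != 0)


def identity_holds(n):
    """Check (4.2) for every pair of words x, y in F_2^n."""
    for x, y in product(product(range(2), repeat=n), repeat=2):
        total = tuple((a + b) % 2 for a, b in zip(x, y))  # x + y over F_2
        star = tuple(a * b for a, b in zip(x, y))         # componentwise product
        if wt(total) != wt(x) + wt(y) - 2 * wt(star):
            return False
    return True


print(identity_holds(4))
```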

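Clearly, Lemma 4.3.5 implies that $\mathrm{wt}(\mathbf{x}) + \mathrm{wt}(\mathbf{y}) \ge \mathrm{wt}(\mathbf{x} + \mathbf{y})$ for $\mathbf{x}, \mathbf{y} \in \mathbb{F}_2^n$.
In fact, this inequality is true for any alphabet $\mathbb{F}_q$. The proof of the following
lemma is left as an exercise (see Exercise 4.18).


Lemma 4.3.6 For any prime power q and $\mathbf{x}, \mathbf{y} \in \mathbb{F}_q^n$, we have

$$\mathrm{wt}(\mathbf{x}) + \mathrm{wt}(\mathbf{y}) \ge \mathrm{wt}(\mathbf{x} + \mathbf{y}) \ge \mathrm{wt}(\mathbf{x}) - \mathrm{wt}(\mathbf{y}). \qquad (4.3)$$

Definition 4.3.7 Let C be a code (not necessarily linear). The minimum
(Hamming) weight of C, denoted wt(C), is the smallest of the weights of the
nonzero codewords of C.

Theorem 4.3.8 Let C be a linear code over $\mathbb{F}_q$. Then d(C) = wt(C).

Proof. Recall that for any words $\mathbf{x}, \mathbf{y}$ we have $d(\mathbf{x}, \mathbf{y}) = \mathrm{wt}(\mathbf{x} - \mathbf{y})$.

By definition, there exist $\mathbf{x}', \mathbf{y}' \in C$ such that $d(\mathbf{x}', \mathbf{y}') = d(C)$, so

$$d(C) = d(\mathbf{x}', \mathbf{y}') = \mathrm{wt}(\mathbf{x}' - \mathbf{y}') \ge \mathrm{wt}(C),$$

since $\mathbf{x}' - \mathbf{y}' \in C$.

Conversely, there is a $\mathbf{z} \in C \setminus \{\mathbf{0}\}$ such that $\mathrm{wt}(C) = \mathrm{wt}(\mathbf{z})$, so

$$\mathrm{wt}(C) = \mathrm{wt}(\mathbf{z}) = d(\mathbf{z}, \mathbf{0}) \ge d(C). \qquad □$$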

Example 4.3.9 Consider the binary linear code C = {0000, 1000, 0100,
1100}. We see that

wt(1000) = 1,
wt(0100) = 1,
wt(1100) = 2.

Hence, d(C) = 1.
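
Theorem 4.3.8 replaces a comparison of all pairs of codewords by a single pass over the nonzero codewords. The sketch below computes both quantities for the code of Example 4.3.9; the function names are ours and words are given as digit strings.

```python
from itertools import combinations


def hamming_distance(x, y):
    """Number of positions in which the words x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def min_distance(C):
    """d(C): the minimum distance over all pairs of distinct codewords."""
    return min(hamming_distance(x, y) for x, y in combinations(C, 2))


def min_weight(C):
    """wt(C): the minimum weight over all nonzero codewords."""
    weights = [sum(ch != "0" for ch in c) for c in C]
    return min(w for w in weights if w > 0)


C = ["0000", "1000", "0100", "1100"]
print(min_distance(C), min_weight(C))
```

For a linear code with M codewords, `min_distance` examines M(M - 1)/2 pairs while `min_weight` examines only M - 1 words, which is the practical content of Remark 4.3.10(ii).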

Remark 4.3.10 (Some advantages of linear codes.) The following are some 
of the reasons why it may be preferable to use linear codes over nonlinear ones: 

(i) As a linear code is a vector space, it can be described completely by using 
a basis (see Section 4.4). 

(ii) The distance of a linear code is equal to the smallest weight of its nonzero 
codewords. 

(iii) The encoding and decoding procedures for a linear code are faster and 
simpler than those for arbitrary nonlinear codes (see Sections 4.7 and 4.8). 


4.4 Bases for linear codes 


Since a linear code is a vector space, all its elements can be described in terms 
of a basis. In this section, we discuss three algorithms that yield either a basis 
for a given linear code or its dual. We first recall some facts from linear algebra. 
Definition 4.4.1 Let A be a matrix over $\mathbb{F}_q$; an elementary row operation
performed on A is any one of the following three operations:

(i) interchanging two rows,

(ii) multiplying a row by a nonzero scalar,

(iii) replacing a row by its sum with the scalar multiple of another row.

Definition 4.4.2 Two matrices are row equivalent if one can be obtained from
the other by a sequence of elementary row operations.

The following are well known facts from linear algebra:

(i) Any matrix M over $\mathbb{F}_q$ can be put in row echelon form (REF) or reduced
row echelon form (RREF) by a sequence of elementary row operations. In
other words, a matrix is row equivalent to a matrix in REF or in RREF.

(ii) For a given matrix, its RREF is unique, but it may have different REFs.
(Recall that the difference between the RREF and the REF is that the
leading nonzero entry of a row in the RREF is equal to 1 and it is the only
nonzero entry in its column.)

We are now ready to describe the three algorithms.

Algorithm 4.1

Input: A nonempty subset S of $\mathbb{F}_q^n$.

Output: A basis for $C = \langle S \rangle$, the linear code generated by S.

Description: Form the matrix A whose rows are the words in S. Use
elementary row operations to find an REF of A. Then the nonzero
rows of the REF form a basis for C.

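Example 4.4.3 Let q = 3. Find a basis for $C = \langle S \rangle$, where
S = {12101, 20110, 01122, 11010}.

$$A = \begin{pmatrix} 1&2&1&0&1 \\ 2&0&1&1&0 \\ 0&1&1&2&2 \\ 1&1&0&1&0 \end{pmatrix} \to \begin{pmatrix} 1&2&1&0&1 \\ 0&2&2&1&1 \\ 0&1&1&2&2 \\ 0&2&2&1&2 \end{pmatrix} \to \begin{pmatrix} 1&2&1&0&1 \\ 0&1&1&2&2 \\ 0&0&0&0&1 \\ 0&0&0&0&0 \end{pmatrix}.$$

The last matrix is in REF. By Algorithm 4.1, {12101, 01122, 00001} is a basis
for C.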
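
Algorithm 4.1 translates directly into a row-reduction routine. The following sketch assumes q = p is prime, so that inverses can be computed with Python's built-in `pow(a, -1, p)` (available from Python 3.8); the function name `rref` is our own. It returns the RREF, which is one particular REF, so the basis it produces may differ from the one found by hand while spanning the same code.

```python
def rref(rows, p):
    """Nonzero rows of the reduced row echelon form of a matrix over F_p (p prime)."""
    M = [[x % p for x in r] for r in rows]
    pivot_row = 0
    for col in range(len(M[0])):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pr = next((r for r in range(pivot_row, len(M)) if M[r][col]), None)
        if pr is None:
            continue
        M[pivot_row], M[pr] = M[pr], M[pivot_row]
        inv = pow(M[pivot_row][col], -1, p)
        M[pivot_row] = [(inv * x) % p for x in M[pivot_row]]
        # Clear the column in every other row.
        for r in range(len(M)):
            if r != pivot_row and M[r][col]:
                f = M[r][col]
                M[r] = [(a - f * b) % p for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == len(M):
            break
    return [row for row in M if any(row)]


S = ["12101", "20110", "01122", "11010"]
basis = rref([[int(ch) for ch in w] for w in S], 3)
for row in basis:
    print("".join(map(str, row)))
```

Because the RREF of a matrix is unique, applying `rref` to the hand-computed basis {12101, 01122, 00001} yields the same rows, which confirms that both bases span the same code.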
Algorithm 4.2

Input: A nonempty subset S of $\mathbb{F}_q^n$.

Output: A basis for $C = \langle S \rangle$, the linear code generated by S.

Description: Form the matrix A whose columns are the words in S.
Use elementary row operations to put A in REF and locate the leading
columns in the REF. Then the original columns of A corresponding to
these leading columns form a basis for C.

Example 4.4.4 Let q = 2. Find a basis for $C = \langle S \rangle$, where
S = {11101, 10110, 01011, 11010}.

$$A = \begin{pmatrix} 1&1&0&1 \\ 1&0&1&1 \\ 1&1&0&0 \\ 0&1&1&1 \\ 1&0&1&0 \end{pmatrix} \to \begin{pmatrix} 1&1&0&1 \\ 0&1&1&0 \\ 0&0&0&1 \\ 0&1&1&1 \\ 0&1&1&1 \end{pmatrix} \to \begin{pmatrix} 1&1&0&1 \\ 0&1&1&0 \\ 0&0&0&1 \\ 0&0&0&0 \\ 0&0&0&0 \end{pmatrix}.$$

Since columns 1, 2 and 4 of the REF are the leading columns, Algorithm 4.2
says that columns 1, 2 and 4 of A form a basis for C; i.e., {11101, 10110, 11010}
is a basis for C.
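
Algorithm 4.2 only needs the positions of the leading columns, which a forward elimination provides. This is a minimal sketch assuming q = p is prime; `pivot_columns` and `basis_from_subset` are hypothetical helper names, and column indices in the code are 0-based.

```python
def pivot_columns(rows, p):
    """Indices (0-based) of the leading columns of an REF of a matrix over F_p."""
    M = [[x % p for x in r] for r in rows]
    pivots, pr = [], 0
    for col in range(len(M[0])):
        r0 = next((r for r in range(pr, len(M)) if M[r][col]), None)
        if r0 is None:
            continue
        M[pr], M[r0] = M[r0], M[pr]
        inv = pow(M[pr][col], -1, p)
        # Eliminate the entries below the pivot.
        for r in range(pr + 1, len(M)):
            f = (M[r][col] * inv) % p
            if f:
                M[r] = [(a - f * b) % p for a, b in zip(M[r], M[pr])]
        pivots.append(col)
        pr += 1
        if pr == len(M):
            break
    return pivots


def basis_from_subset(S, p):
    """Select a basis of <S> from the words of S, following Algorithm 4.2."""
    words = [[int(ch) for ch in w] for w in S]
    A = [list(col) for col in zip(*words)]  # the words of S become the columns of A
    return [S[j] for j in pivot_columns(A, p)]


print(basis_from_subset(["11101", "10110", "01011", "11010"], 2))
```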


Remark 4.4.5 Note that the basis that Algorithm 4.2 yields is a subset of the
given set S, while this is not necessarily the case for Algorithm 4.1.

Algorithm 4.3

Input: A nonempty subset S of $\mathbb{F}_q^n$.

Output: A basis for the dual code $C^\perp$, where $C = \langle S \rangle$.

Description: Form the matrix A whose rows are the words in S. Use
elementary row operations to place A in RREF. Let G be the $k \times n$
matrix consisting of all the nonzero rows of the RREF:

$$A \to \begin{pmatrix} G \\ O \end{pmatrix}.$$

(Here, O denotes the zero matrix.)

The matrix G contains k leading columns. Permute the columns of
G to form

$$G' = (I_k \mid X),$$

where $I_k$ denotes the $k \times k$ identity matrix. Form a matrix H' as
follows:

$$H' = (-X^T \mid I_{n-k}),$$

where $X^T$ denotes the transpose of X.

Apply the inverse of the permutation applied to the columns of G
to the columns of H' to form H. Then the rows of H form a basis for
$C^\perp$.

Remark 4.4.6 (i) Notice that Algorithm 4.3 also provides a basis for C since
it includes Algorithm 4.1.

(ii) An explanation of the principles behind Algorithm 4.3 is given in
Theorem 4.5.9 in the following section.

Example 4.4.7 Let q = 3. Find a basis for $C^\perp$ if the RREF of A is (with the
columns labelled 1, 2, ..., 10 from left to right)

$$G = \begin{pmatrix} 1&0&2&0&0&2&0&1&0&2 \\ 0&0&0&1&0&1&0&0&0&1 \\ 0&0&0&0&1&0&0&2&0&0 \\ 0&0&0&0&0&0&1&0&0&1 \\ 0&0&0&0&0&0&0&0&1&2 \end{pmatrix}.$$

The leading columns of G are columns 1, 4, 5, 7 and 9. We permute the columns
of G into the order 1, 4, 5, 7, 9, 2, 3, 6, 8, 10 to form the matrix

$$G' = (I_5 \mid X) = \begin{pmatrix} 1&0&0&0&0&0&2&2&1&2 \\ 0&1&0&0&0&0&0&1&0&1 \\ 0&0&1&0&0&0&0&0&2&0 \\ 0&0&0&1&0&0&0&0&0&1 \\ 0&0&0&0&1&0&0&0&0&2 \end{pmatrix}.$$

Form the matrix H' and finally rearrange the columns of H' using the inverse
permutation to obtain H. The columns of H' are in the order 1, 4, 5, 7, 9, 2, 3, 6, 8, 10:

$$H' = \begin{pmatrix} 0&0&0&0&0&1&0&0&0&0 \\ 1&0&0&0&0&0&1&0&0&0 \\ 1&2&0&0&0&0&0&1&0&0 \\ 2&0&1&0&0&0&0&0&1&0 \\ 1&2&0&2&1&0&0&0&0&1 \end{pmatrix},$$

and the columns of H are in the order 1, 2, ..., 10:

$$H = \begin{pmatrix} 0&1&0&0&0&0&0&0&0&0 \\ 1&0&1&0&0&0&0&0&0&0 \\ 1&0&0&2&0&1&0&0&0&0 \\ 2&0&0&0&1&0&0&1&0&0 \\ 1&0&0&2&0&0&2&0&1&1 \end{pmatrix}.$$

By Algorithm 4.3, the rows of H form a basis for $C^\perp$.
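
The column permutations of Algorithm 4.3 can be avoided in code by writing the entries of $-X^T$ straight into the positions of the leading columns. The following sketch takes this shortcut, which is a restatement of the algorithm rather than the text's own formulation; it assumes q = p is prime and that G is already in RREF with no zero rows, and the function name `parity_check_from_rref` is ours.

```python
def parity_check_from_rref(G, p):
    """Rows of H as in Algorithm 4.3, for a matrix G in RREF over F_p (p prime)."""
    n = len(G[0])
    # Leading column of each row of G (0-based), then the remaining columns.
    pivots = [next(j for j in range(n) if row[j]) for row in G]
    free = [j for j in range(n) if j not in pivots]
    H = []
    for j in free:
        h = [0] * n
        h[j] = 1                          # identity part, in the original position
        for i, pc in enumerate(pivots):
            h[pc] = (-G[i][j]) % p        # entry of -X^T, in the original position
        H.append(h)
    return H


G = [
    [1, 0, 2, 0, 0, 2, 0, 1, 0, 2],
    [0, 0, 0, 1, 0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1, 0, 0, 2, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 2],
]
H = parity_check_from_rref(G, 3)
for row in H:
    print(row)

# Every row of H is orthogonal to every row of G.
print(all(sum(a * b for a, b in zip(g, h)) % 3 == 0 for g in G for h in H))
```

Applied to the matrix G of Example 4.4.7, the routine reproduces the matrix H found there.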


4.5 Generator matrix and parity-check matrix 

Knowing a basis for a linear code enables us to describe its codewords explicitly. 
In coding theory, a basis for a linear code is often represented in the form of 
a matrix, called a generator matrix, while a matrix that represents a basis for 
the dual code is called a parity-check matrix. These matrices play an important 
role in coding theory. 

Definition 4.5.1 (i) A generator matrix for a linear code C is a matrix G whose 
rows form a basis for C . 

(ii) A parity-check matrix H for a linear code C is a generator matrix for the 
dual code C x . 

Remark 4.5.2 (i) If C is an [n, k]-linear code, then a generator matrix for C
must be a $k \times n$ matrix and a parity-check matrix for C must be an $(n - k) \times n$
matrix.

(ii) Algorithm 4.3 of Section 4.4 can be used to find generator and
parity-check matrices for a linear code.

(iii) As the number of bases for a vector space usually exceeds one, the number
of generator matrices for a linear code also usually exceeds one. Moreover,
even when the basis is fixed, a permutation (different from the identity) of the
rows of a generator matrix also leads to a different generator matrix.

(iv) The rows of a generator matrix are linearly independent. The same
holds for the rows of a parity-check matrix. To show that a $k \times n$ matrix G is
indeed a generator matrix for a given [n, k]-linear code C, it suffices to show
that the rows of G are codewords in C and that they are linearly independent.
Alternatively, one may also show that C is contained in the row space of G.

Definition 4.5.3 (i) A generator matrix of the form $(I_k \mid X)$ is said to be in
standard form.

(ii) A parity-check matrix in the form $(Y \mid I_{n-k})$ is said to be in standard form.

Lemma 4.5.4 Let C be an [n, k]-linear code over $\mathbb{F}_q$, with generator matrix
G. Then $\mathbf{v} \in \mathbb{F}_q^n$ belongs to $C^\perp$ if and only if $\mathbf{v}$ is orthogonal to every row of
G; i.e., $\mathbf{v} \in C^\perp \Leftrightarrow \mathbf{v}G^T = \mathbf{0}$. In particular, given an $(n - k) \times n$ matrix H,
then H is a parity-check matrix for C if and only if the rows of H are linearly
independent and $HG^T = O$.

Proof. Let $\mathbf{r}_i$ denote the ith row of G. In particular, $\mathbf{r}_i \in C$ for all $1 \le i \le k$,
and every $\mathbf{c} \in C$ may be written as

$$\mathbf{c} = \lambda_1\mathbf{r}_1 + \cdots + \lambda_k\mathbf{r}_k,$$

where $\lambda_1, \ldots, \lambda_k \in \mathbb{F}_q$.

If $\mathbf{v} \in C^\perp$, then $\mathbf{v} \cdot \mathbf{c} = 0$ for all $\mathbf{c} \in C$. In particular, $\mathbf{v}$ is orthogonal to $\mathbf{r}_i$,
for all $1 \le i \le k$; i.e., $\mathbf{v}G^T = \mathbf{0}$.

Conversely, if $\mathbf{v} \cdot \mathbf{r}_i = 0$ for all $1 \le i \le k$, then clearly, for any $\mathbf{c} =
\lambda_1\mathbf{r}_1 + \cdots + \lambda_k\mathbf{r}_k \in C$,

$$\mathbf{v} \cdot \mathbf{c} = \lambda_1(\mathbf{v} \cdot \mathbf{r}_1) + \cdots + \lambda_k(\mathbf{v} \cdot \mathbf{r}_k) = 0.$$

For the last statement, if H is a parity-check matrix for C, then the rows of
H are linearly independent by definition. Since the rows of H are codewords
in $C^\perp$, it follows from the earlier statement that $HG^T = O$.

Conversely, if $HG^T = O$, then the earlier statement shows that the rows
of H, and hence the row space of H, are contained in $C^\perp$. Since the rows
of H are linearly independent, the row space of H has dimension n - k, so
the row space of H is indeed $C^\perp$. In other words, H is a parity-check matrix
for C. □

Remark 4.5.5 An alternative but equivalent formulation for Lemma 4.5.4 is
the following:

Let C be an [n, k]-linear code over $\mathbb{F}_q$, with parity-check matrix H. Then
$\mathbf{v} \in \mathbb{F}_q^n$ belongs to C if and only if $\mathbf{v}$ is orthogonal to every row of H; i.e.,
$\mathbf{v} \in C \Leftrightarrow \mathbf{v}H^T = \mathbf{0}$. In particular, given a $k \times n$ matrix G, then G is a
generator matrix for C if and only if the rows of G are linearly independent
and $GH^T = O$.
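
Remark 4.5.5 gives a membership test that needs only the parity-check matrix. The sketch below, assuming q is prime, computes $\mathbf{v}H^T$ for every $\mathbf{v}$ and keeps the words for which it vanishes. The matrix H used here is our own choice: its rows 1010 and 0101 form a basis of the self-dual code of Example 4.2.5(i), so it serves as a parity-check matrix for that code.

```python
from itertools import product


def syndrome(v, H, q):
    """The vector v H^T over F_q (q prime)."""
    return tuple(sum(a * b for a, b in zip(v, row)) % q for row in H)


def code_from_parity_check(H, q):
    """All words v of F_q^n with v H^T = 0, by exhaustive search."""
    n = len(H[0])
    return {v for v in product(range(q), repeat=n) if not any(syndrome(v, H, q))}


H = [[1, 0, 1, 0], [0, 1, 0, 1]]
C = code_from_parity_check(H, 2)
print(sorted(C))
```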

One of the consequences of Lemma 4.5.4 is the following theorem relating 
the distance d of a linear code C to properties of a parity-check matrix of C. 
When d is small, Corollary 4.5.7 can be a useful way to determine d. 


Theorem 4.5.6 Let C be a linear code and let H be a parity-check matrix for
C. Then

(i) C has distance $\ge d$ if and only if any $d - 1$ columns of H are linearly
independent; and

(ii) C has distance $\le d$ if and only if H has d columns that are linearly
dependent.

Proof. Let $\mathbf{v} = (v_1, \ldots, v_n) \in C$ be a word of weight $e > 0$. Suppose
the nonzero coordinates are in the positions $i_1, \ldots, i_e$, so that $v_j = 0$ if
$j \notin \{i_1, \ldots, i_e\}$. Let $\mathbf{c}_i$ ($1 \le i \le n$) denote the $i$th column of H.

By Lemma 4.5.4 (or, more precisely, its equivalent formulation in
Remark 4.5.5), C contains a nonzero word $\mathbf{v} = (v_1, \ldots, v_n)$ of weight e (whose
nonzero coordinates are $v_{i_1}, \ldots, v_{i_e}$) if and only if

$$\mathbf{0} = \mathbf{v}H^T = v_{i_1} \mathbf{c}_{i_1}^T + \cdots + v_{i_e} \mathbf{c}_{i_e}^T,$$

which is true if and only if there are e columns of H (namely, $\mathbf{c}_{i_1}, \ldots, \mathbf{c}_{i_e}$) that
are linearly dependent.

To say that the distance of C is $\ge d$ is equivalent to saying that C does not
contain any nonzero word of weight $\le d - 1$, which is in turn equivalent to
saying that any $\le d - 1$ columns of H are linearly independent. This proves (i).

Similarly, to say that the distance of C is $\le d$ is equivalent to saying that C
contains a nonzero word of weight $\le d$, which is in turn equivalent to saying
that H has $\le d$ columns (and hence d columns) that are linearly dependent.
This proves (ii). □

An immediate corollary of Theorem 4.5.6 is the following result. 

Corollary 4.5.7 Let C be a linear code and let H be a parity-check matrix for
C. Then the following statements are equivalent:

(i) C has distance d;

(ii) any $d - 1$ columns of H are linearly independent and H has d columns
that are linearly dependent.
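For a small binary code, Corollary 4.5.7 suggests a direct search for d. The following Python sketch is a brute-force illustration, not an efficient method. It assumes q = 2, where a minimal linearly dependent set of columns is simply a set of columns summing to the zero vector, and the function name distance_from_parity_check is a hypothetical choice.

```python
from itertools import combinations


def distance_from_parity_check(H):
    """Smallest d such that some d columns of the binary matrix H sum to zero."""
    n = len(H[0])
    cols = [tuple(row[j] for row in H) for j in range(n)]
    for d in range(1, n + 1):
        for idx in combinations(range(n), d):
            # Over GF(2), a smallest dependent set of columns has all of its
            # coefficients equal to 1, so the columns sum to the zero vector.
            if all(sum(cols[j][i] for j in idx) % 2 == 0
                   for i in range(len(H))):
                return d
    return None  # the columns are independent: the code is {0}
```

The search visits subsets of columns in order of increasing size, so the first dependent subset found has exactly d columns; the running time grows exponentially with n.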

Example 4.5.8 Let C be the binary linear code with parity-check matrix

$$H = \begin{pmatrix} 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 \end{pmatrix}.$$

By inspection, it is seen that there are no zero columns and no two columns of
H sum to $\mathbf{0}^T$, so any two columns of H are linearly independent. However,
columns 1, 3 and 4 sum to $\mathbf{0}^T$, and hence are linearly dependent. Therefore, the
distance of C is d = 3.

Theorem 4.5.9 If $G = (I_k \mid X)$ is the standard form generator matrix of an
[n, k]-code C, then a parity-check matrix for C is $H = (-X^T \mid I_{n-k})$.

Proof. The equation $HG^T = O$ is satisfied, since
$HG^T = -X^T I_k + I_{n-k} X^T = O$. By considering the
last $n - k$ coordinates, it is clear that the rows of H are linearly independent.
Therefore, the conclusion follows from Lemma 4.5.4. □


Remark 4.5.10 Theorem 4.5.9 shows that Algorithm 4.3 of Section 4.4 actually
gives what it claims to yield.
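The construction in Theorem 4.5.9 is mechanical, and a short Python sketch may make it concrete. The sketch assumes q is prime (arithmetic modulo q), and the function name parity_check_from_standard_form is a hypothetical choice; it is an illustration of the theorem, not a reproduction of Algorithm 4.3.

```python
def parity_check_from_standard_form(G, q):
    """Given G = (I_k | X) over GF(q), q prime, return H = (-X^T | I_{n-k})."""
    k, n = len(G), len(G[0])
    # Sanity check: the left k x k block must be the identity matrix.
    assert all(G[i][j] % q == (1 if i == j else 0)
               for i in range(k) for j in range(k)), "G is not in standard form"
    X = [row[k:] for row in G]
    H = []
    for r in range(n - k):
        minus_xt = [(-X[i][r]) % q for i in range(k)]          # row r of -X^T
        identity = [1 if c == r else 0 for c in range(n - k)]  # row r of I_{n-k}
        H.append(minus_xt + identity)
    return H
```

Over $\mathbf{F}_2$ the sign is immaterial, since $-X^T = X^T$; over $\mathbf{F}_3$, for instance, the entry 1 in X becomes 2 in H.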


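Example 4.5.11 Find a generator matrix and a parity-check matrix for the
binary linear code $C = \langle S \rangle$, where S = {11101, 10110, 01011, 11010}.

By Algorithm 4.1,

$$\begin{pmatrix} 11101 \\ 10110 \\ 01011 \\ 11010 \end{pmatrix} \rightarrow \begin{pmatrix} 11101 \\ 01011 \\ 00111 \\ 00000 \end{pmatrix} \rightarrow \begin{pmatrix} 10001 \\ 01011 \\ 00111 \\ 00000 \end{pmatrix},$$

which is in RREF. By Algorithm 4.3, we have

$$G = \left(\begin{array}{ccc|cc} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \end{array}\right) \quad \text{and} \quad H = \left(\begin{array}{ccc|cc} 0 & 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 & 1 \end{array}\right).$$

Here, G is a generator matrix for C and H is a parity-check matrix for C. We
can verify that $GH^T = O = HG^T$.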
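The row reduction used above can be sketched in Python for the binary case. This is a minimal illustration under the assumption q = 2 (row addition is bitwise XOR); the function name rref_gf2 is hypothetical, and the sketch is not claimed to follow Algorithm 4.1 step by step.

```python
def rref_gf2(words):
    """Row-reduce binary words (strings of 0s and 1s) and drop the zero rows."""
    rows = [[int(b) for b in w] for w in words]
    n = len(rows[0])
    r = 0  # index of the next pivot row
    for col in range(n):
        piv = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue  # no pivot in this column
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                # clear the entry by adding the pivot row (XOR over GF(2))
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        r += 1
    return ["".join(map(str, row)) for row in rows[:r]]
```

Applied to the four words of S, the function returns the three nonzero rows of the RREF, which form a generator matrix for C.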


It should be noted that it is not true that every linear code has a generator 
matrix in standard form. 


Example 4.5.12 Consider the binary linear code

C = {000, 001, 100, 101}.

Since dim(C) = 2, by Theorem 4.1.15(ii) the number of bases for C is

$$\frac{1}{2!}(2^2 - 1)(2^2 - 2) = 3.$$

We can list all the bases for C:

{001, 100}, {001, 101}, {100, 101}.

Hence, C has six generator matrices:

$$\begin{pmatrix} 001 \\ 100 \end{pmatrix}, \quad \begin{pmatrix} 100 \\ 001 \end{pmatrix}, \quad \begin{pmatrix} 001 \\ 101 \end{pmatrix}, \quad \begin{pmatrix} 101 \\ 001 \end{pmatrix}, \quad \begin{pmatrix} 100 \\ 101 \end{pmatrix}, \quad \begin{pmatrix} 101 \\ 100 \end{pmatrix}.$$

Note that none of these matrices is in standard form.


4.6 Equivalence of linear codes 

While certain linear codes may not have a generator matrix in standard form, 
after a suitable permutation of the coordinates of the codewords and possibly 
multiplying certain coordinates with some nonzero scalars, one can always 
arrive at a new code which has a generator matrix in standard form. 

Definition 4.6.1 Two (n, M)-codes over $\mathbf{F}_q$ are equivalent if one can be obtained
from the other by a combination of operations of the following types:

(i) permutation of the n digits of the codewords; 

(ii) multiplication of the symbols appearing in a fixed position by a nonzero 
scalar. 

Example 4.6.2 (i) Let q = 2 and n = 4. Choosing to rearrange the bits in the 
order 2, 4, 1, 3, we see that the code 

C = {0000,0101,0010,0111} 

is equivalent to the code 

C' = {0000, 1100, 0001,1101}. 

(ii) Let q = 3 and n = 3. Consider the ternary code 

C = {000,011,022}. 

Permuting the first and second positions, followed by multiplying the third 
position by 2, we obtain the equivalent code 

C' = {000, 102, 201}. 
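The two operations of Definition 4.6.1 are easy to mimic in code. The Python sketch below assumes q is prime, so that multiplication modulo q models $\mathbf{F}_q$; the function names permute and scale are hypothetical, and words are represented as tuples of integers.

```python
def permute(word, order):
    """Rearrange digits: new position j holds the old digit order[j] (1-based)."""
    return tuple(word[i - 1] for i in order)


def scale(word, position, scalar, q):
    """Multiply the digit in the given 1-based position by a nonzero scalar mod q."""
    assert scalar % q != 0, "the scalar must be nonzero"
    w = list(word)
    w[position - 1] = (w[position - 1] * scalar) % q
    return tuple(w)


# Part (i): rearrange the bits of each codeword in the order 2, 4, 1, 3.
C1 = {(0, 0, 0, 0), (0, 1, 0, 1), (0, 0, 1, 0), (0, 1, 1, 1)}
C1_equiv = {permute(c, (2, 4, 1, 3)) for c in C1}

# Part (ii): swap positions 1 and 2, then multiply position 3 by 2 over GF(3).
C2 = {(0, 0, 0), (0, 1, 1), (0, 2, 2)}
C2_equiv = {scale(permute(c, (2, 1, 3)), 3, 2, 3) for c in C2}
```

Both operations are invertible, which is why equivalence of codes is a symmetric relation.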

Theorem 4.6.3 Any linear code C is equivalent to a linear code C' with a
generator matrix in standard form.

Proof. If G is a generator matrix for C, place G in RREF. Rearrange the columns
of the RREF so that the leading columns come first and form an identity matrix.
The result is a matrix, G', in standard form which is a generator matrix for a
code C' equivalent to the code C. □

Remark 4.6.4 Theorem 4.6.3 is essentially the first part of Algorithm 4.3 of
Section 4.4.


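Example 4.6.5 Let C be a binary linear code with generator matrix

$$\begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 0 & 0 & 1 \end{pmatrix}.$$

Rearranging the columns in the order 1, 3, 4, 2, 5, 6, 7 yields the matrix

$$G' = \left(\begin{array}{ccc|cccc} 1 & 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 \end{array}\right).$$

Let C' be the code generated by G'; then C' is equivalent to C and C' has a
generator matrix G', which is in standard form.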


Example 4.6.6 We saw in Example 4.5.12 that the binary linear code C =
{000, 001, 100, 101} does not have a generator matrix in standard form. However,
if we permute the second and third coordinates, we obtain the equivalent
binary linear code

C' = {000, 010, 100, 110},

and it is clear that

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$$

is a generator matrix in standard form for C'.


4.7 Encoding with a linear code 

Let C be an [n, k, d]-linear code over the finite field $\mathbf{F}_q$. Each codeword of C
can represent one piece of information, so C can represent $q^k$ distinct pieces
of information. Once a basis $\{\mathbf{r}_1, \ldots, \mathbf{r}_k\}$ is fixed for C, each codeword v, or,
equivalently, each of the $q^k$ pieces of information, can be uniquely written as a
linear combination,

$$\mathbf{v} = u_1 \mathbf{r}_1 + \cdots + u_k \mathbf{r}_k,$$

where $u_1, \ldots, u_k \in \mathbf{F}_q$.

Equivalently, we may set G to be the generator matrix of C whose $i$th row
is the vector $\mathbf{r}_i$ in the chosen basis. Given a vector $\mathbf{u} = (u_1, \ldots, u_k) \in \mathbf{F}_q^k$, it is
clear that

$$\mathbf{v} = \mathbf{u}G = u_1 \mathbf{r}_1 + \cdots + u_k \mathbf{r}_k$$

is a codeword in C. Conversely, any $\mathbf{v} \in C$ can be written uniquely as $\mathbf{v} = \mathbf{u}G$,
where $\mathbf{u} = (u_1, \ldots, u_k) \in \mathbf{F}_q^k$. Hence, every word $\mathbf{u} \in \mathbf{F}_q^k$ can be encoded as
$\mathbf{v} = \mathbf{u}G$.

The process of representing the elements u of $\mathbf{F}_q^k$ as codewords $\mathbf{v} = \mathbf{u}G$ in
C is called encoding.

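Example 4.7.1 Let C be the binary [5, 3]-linear code with the generator matrix

$$G = \begin{pmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 \end{pmatrix};$$

then the message u = 101 is encoded as

$$\mathbf{v} = \mathbf{u}G = (101) \begin{pmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 \end{pmatrix} = 10011.$$

Note that the information rate of C is 3/5, i.e., only 3 bits out of 5 are used to
carry the message.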
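Encoding is a single vector-matrix product, as the following Python sketch shows. It assumes q is prime (arithmetic modulo q), and the function name encode is a hypothetical choice.

```python
def encode(u, G, q=2):
    """Return the codeword v = uG over GF(q), q prime, for a message u of length k."""
    k, n = len(G), len(G[0])
    assert len(u) == k, "the message length must equal the number of rows of G"
    return [sum(u[i] * G[i][j] for i in range(k)) % q for j in range(n)]


G = [[1, 0, 1, 1, 0],
     [0, 1, 0, 1, 1],
     [0, 0, 1, 0, 1]]
v = encode([1, 0, 1], G)  # the sum of the first and third rows of G
```

For a binary code, uG is just the sum of those rows of G that correspond to the 1s in u.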

Remark 4.7.2 (Advantages of having G in standard form.) Some of the
advantages of having the generator matrix of a linear code in standard form
are as follows:

(i) If a linear code C has a generator matrix G in standard form, $G = (I \mid X)$,
then Algorithm 4.3 of Section 4.4 at once yields

$$H = (-X^T \mid I)$$

as a parity-check matrix for C.

(ii) If an [n, k, d]-linear code C has a generator matrix G in standard form,
$G = (I \mid X)$, then it is trivial to recover the message u from the codeword
$\mathbf{v} = \mathbf{u}G$ since

$$\mathbf{v} = \mathbf{u}G = \mathbf{u}(I \mid X) = (\mathbf{u}, \mathbf{u}X);$$

i.e., the first k digits in the codeword $\mathbf{v} = \mathbf{u}G$ give the message u; they
are called the message digits. The remaining $n - k$ digits are called check
digits. The check digits represent the redundancy which has been added
to the message for protection against noise.


4.8 Decoding of linear codes 

A code is of practical use only if an efficient decoding scheme can be applied
to it. In this section, we discuss a rather simple but elegant nearest neighbour
decoding for linear codes, as well as a modification that improves its performance
when the length of the code is large.


4.8.1 Cosets 

We begin with the notion of a coset. Cosets play a crucial role in the decoding 
schemes to be discussed in this chapter. 

Definition 4.8.1 Let C be a linear code of length n over $\mathbf{F}_q$, and let $\mathbf{u} \in \mathbf{F}_q^n$ be
any vector of length n; we define the coset of C determined by u to be the set

$$C + \mathbf{u} = \{\mathbf{v} + \mathbf{u} : \mathbf{v} \in C\} \ (= \mathbf{u} + C).$$


Remark 4.8.2 For the reader who knows some group theory, note that, by
considering the vector addition, $\mathbf{F}_q^n$ is a finite abelian group, and a linear code C
over $\mathbf{F}_q$ of length n is also a subgroup of $\mathbf{F}_q^n$. The coset of a linear code defined
above coincides with the usual notion of a coset in group theory.


Example 4.8.3 Let q = 2 and C = {000, 101, 010, 111}. Then

C + 000 = {000, 101, 010, 111},
C + 001 = {001, 100, 011, 110},
C + 010 = {010, 111, 000, 101},
C + 011 = {011, 110, 001, 100},
C + 100 = {100, 001, 110, 011},
C + 101 = {101, 000, 111, 010},
C + 110 = {110, 011, 100, 001},
C + 111 = {111, 010, 101, 000}.


Note that

C + 000 = C + 010 = C + 101 = C + 111 = C;

C + 001 = C + 011 = C + 100 = C + 110 = $\mathbf{F}_2^3 \setminus C$.

Theorem 4.8.4 Let C be an [n, k, d]-linear code over the finite field $\mathbf{F}_q$. Then,

(i) every vector of $\mathbf{F}_q^n$ is contained in some coset of C;

(ii) for all $\mathbf{u} \in \mathbf{F}_q^n$, $|C + \mathbf{u}| = |C| = q^k$;

(iii) for all $\mathbf{u}, \mathbf{v} \in \mathbf{F}_q^n$, $\mathbf{u} \in C + \mathbf{v}$ implies that $C + \mathbf{u} = C + \mathbf{v}$;

(iv) two cosets are either identical or they have empty intersection;

(v) there are $q^{n-k}$ different cosets of C;

(vi) for all $\mathbf{u}, \mathbf{v} \in \mathbf{F}_q^n$, $\mathbf{u} - \mathbf{v} \in C$ if and only if u and v are in the same coset.

Proof. (i) The vector $\mathbf{v} \in \mathbf{F}_q^n$ is clearly contained in the coset C + v.

(ii) By definition, C + u has at most $|C| = q^k$ elements. Clearly, two
elements c + u and c' + u of C + u are equal if and only if c = c', hence
$|C + \mathbf{u}| = |C| = q^k$.

(iii) It follows from the definition of C + v that $C + \mathbf{u} \subseteq C + \mathbf{v}$. Then, by
(ii), C + u = C + v.

(iv) Consider two cosets C + u and C + v and suppose $\mathbf{x} \in (C + \mathbf{u}) \cap (C + \mathbf{v})$.
Since $\mathbf{x} \in C + \mathbf{u}$, (iii) shows that C + u = C + x. Similarly, since $\mathbf{x} \in C + \mathbf{v}$,
it follows that C + v = C + x. Hence, C + u = C + v.

(v) follows immediately from (i), (ii) and (iv).

(vi) If $\mathbf{u} - \mathbf{v} = \mathbf{c} \in C$, then $\mathbf{u} = \mathbf{c} + \mathbf{v} \in C + \mathbf{v}$, so C + u = C + v. By the
proof of (i), $\mathbf{u} \in C + \mathbf{u}$ and $\mathbf{v} \in C + \mathbf{v}$, so u and v are in the same coset.

Conversely, suppose u, v are both in the coset C + x. Then u = c + x and
v = c' + x, for some $\mathbf{c}, \mathbf{c}' \in C$. Hence, $\mathbf{u} - \mathbf{v} = \mathbf{c} - \mathbf{c}' \in C$. □
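Parts (i) to (v) of Theorem 4.8.4 say that the cosets of C partition $\mathbf{F}_q^n$ into $q^{n-k}$ blocks of size $q^k$. The Python sketch below lists the cosets of a small code by brute force. It assumes q is prime, the code is given as an explicit collection of words, and the function name cosets is a hypothetical choice.

```python
from itertools import product


def cosets(code, n, q=2):
    """Partition GF(q)^n (q prime) into the cosets u + C of a linear code C."""
    code = {tuple(c) for c in code}
    seen, result = set(), []
    for u in product(range(q), repeat=n):
        if u in seen:
            continue  # u lies in a coset already found, cf. Theorem 4.8.4(iii)
        coset = frozenset(tuple((a + b) % q for a, b in zip(c, u)) for c in code)
        seen |= coset
        result.append(coset)
    return result


C = [(0, 0, 0, 0), (1, 0, 1, 1), (0, 1, 0, 1), (1, 1, 1, 0)]
all_cosets = cosets(C, 4)
```

For this binary [4, 2]-code the list all_cosets contains $2^{4-2} = 4$ cosets, each with $2^2 = 4$ words.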

Example 4.8.5 The cosets of the binary linear code

C = {0000, 1011, 0101, 1110}

are as follows:

0000 + C :   0000   1011   0101   1110
0001 + C :   0001   1010   0100   1111
0010 + C :   0010   1001   0111   1100
1000 + C :   1000   0011   1101   0110


Remark 4.8.6 The above array is called a (Slepian) standard array.

Definition 4.8.7 A word of the least (Hamming) weight in a coset is called a 
coset leader. 

Example 4.8.8 In Example 4.8.5, the vectors u in u + C in the first column are
coset leaders for the respective cosets. Note that the coset 0001 + C can also
have 0100 as coset leader.

4.8.2 Nearest neighbour decoding for linear codes

Let C be a linear code. Assume the codeword v is transmitted and the word w
is received, resulting in the error pattern (or error string)

$$\mathbf{e} = \mathbf{w} - \mathbf{v} \in \mathbf{w} + C.$$

Then $\mathbf{w} - \mathbf{e} = \mathbf{v} \in C$, so, by part (vi) of Theorem 4.8.4, the error pattern e and
the received word w are in the same coset.

Since error patterns of small weight are the most likely to occur, nearest
neighbour decoding works for a linear code C in the following manner. Upon
receiving the word w, we choose a word e of least weight in the coset w + C
and conclude that $\mathbf{v} = \mathbf{w} - \mathbf{e}$ was the codeword transmitted.
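The decoding rule just described can be sketched in a few lines of Python for a binary code given as an explicit list of codewords. This is an illustration of complete decoding only: ties between words of least weight are broken lexicographically, which is an arbitrary choice made here, and the function name decode_by_coset is hypothetical.

```python
def decode_by_coset(w, code):
    """Complete nearest neighbour decoding of a binary word w (tuple of 0s and 1s)."""
    # The coset w + C, viewed as the set of candidate error patterns e = w - c.
    coset = [tuple(a ^ b for a, b in zip(w, c)) for c in code]
    # A word of least weight in the coset; ties are broken lexicographically.
    e = min(coset, key=lambda x: (sum(x), x))
    # Over GF(2), subtraction and addition coincide: v = w - e = w + e.
    return tuple(a ^ b for a, b in zip(w, e))


C = [(0, 0, 0, 0), (1, 0, 1, 1), (0, 1, 0, 1), (1, 1, 1, 0)]
```

With this code, decode_by_coset((1, 1, 0, 1), C) selects the error pattern 1000 and returns the codeword 0101.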

Example 4.8.9 Let q = 2 and C = {0000, 1011, 0101, 1110}. Decode the
following received words: (i) w = 1101; (ii) w = 1111.

First, we write down the standard array of C (exactly the one in Example
4.8.5):

0000 + C :   0000   1011   0101   1110
0001 + C :   0001   1010   0100   1111
0010 + C :   0010   1001   0111   1100
1000 + C :   1000   0011   1101   0110


(i) w = 1101: w + C is the fourth coset. The word of least weight in this
coset is 1000 (note that this is the unique coset leader of this coset). Hence,
1101 - 1000 = 1101 + 1000 = 0101 was the most likely codeword transmitted
(note that this is the word at the top of the column where the received word
1101 is found).

(ii) w = 1111: w + C is the second coset. There are two words of smallest
weight, 0001 and 0100, in this coset. (This means that there are two choices
for the coset leader. In the array above, we have chosen 0001 as the coset
leader. If we had chosen 0100, we would have obtained a slightly different
array.) When the coset of the received word has more than one possible leader,
the approach we take for decoding depends on the decoding scheme (i.e.,
incomplete or complete) used. If we are doing incomplete decoding, we ask for a
retransmission. If we are doing complete decoding, we arbitrarily choose one
of the words of smallest weight, say 0001, to be the error pattern, and conclude
that 1111 - 0001 = 1111 + 0001 = 1110 was a most likely codeword sent.
(Note: this means we choose 0001 as the coset leader, form the standard array
as above, then observe that a most likely word sent is again found at the top of
the column where the received word is located.) What happens if we choose
0100 as the coset leader/error pattern?

4.8.3 Syndrome decoding

The decoding scheme based on the standard array works reasonably well when
the length n of the linear code is small, but it may take a considerable amount of
time when n is large. Some time can be saved by making use of the syndrome
to identify the coset to which the received word belongs.


Definition 4.8.10 Let C be an [n, k, d]-linear code over $\mathbf{F}_q$ and let H be a
parity-check matrix for C. For any $\mathbf{w} \in \mathbf{F}_q^n$, the syndrome of w is the word
$S(\mathbf{w}) = \mathbf{w}H^T \in \mathbf{F}_q^{n-k}$. (Strictly speaking, as the syndrome depends on the
choice of the parity-check matrix H, it is more appropriate to denote the
syndrome of w by $S_H(\mathbf{w})$ to emphasize this dependence. However, for
simplicity of notation, the suffix H is dropped whenever there is no risk of
ambiguity.)


Theorem 4.8.11 Let C be an [n, k, d]-linear code and let H be a parity-check
matrix for C. For $\mathbf{u}, \mathbf{v} \in \mathbf{F}_q^n$, we have

(i) S(u + v) = S(u) + S(v);

(ii) S(u) = 0 if and only if u is a codeword in C;

(iii) S(u) = S(v) if and only if u and v are in the same coset of C.

Proof. (i) is an immediate consequence of the definition of the syndrome.

(ii) By the definition of the syndrome, S(u) = 0 if and only if $\mathbf{u}H^T = \mathbf{0}$,
which, by Remark 4.5.5, is equivalent to $\mathbf{u} \in C$.

(iii) follows from (i), (ii) and Theorem 4.8.4(vi): S(u) = S(v) if and only if
S(u - v) = 0, i.e., if and only if $\mathbf{u} - \mathbf{v} \in C$. □

Remark 4.8.12 (i) Part (iii) of Theorem 4.8.11 says that we can identify a
coset by its syndrome; conversely, all the words in a given coset yield the same
syndrome, so the syndrome of a coset is the syndrome of any word in the coset.
In other words, there is a one-to-one correspondence between the cosets and
the syndromes.

(ii) Since the syndromes are in $\mathbf{F}_q^{n-k}$, there are at most $q^{n-k}$ syndromes.
Theorem 4.8.4(v) says that there are $q^{n-k}$ cosets, so there are $q^{n-k}$ corresponding
syndromes (all distinct). Therefore, all the vectors in $\mathbf{F}_q^{n-k}$ appear as syndromes.


Definition 4.8.13 A table which matches each coset leader with its syndrome
is called a syndrome look-up table. (Sometimes such a table is called a standard
decoding array (SDA).)

Table 4.2.

Coset leader u    Syndrome S(u)
0000              00
0001              01
0010              10
1000              11


Steps to construct a syndrome look-up table assuming complete nearest
neighbour decoding

Step 1: List all the cosets for the code, choose from each coset a word
of least weight as coset leader u.

Step 2: Find a parity-check matrix H for the code and, for each coset
leader u, calculate its syndrome $S(\mathbf{u}) = \mathbf{u}H^T$.

Remark 4.8.14 For incomplete nearest neighbour decoding, if we find more
than one word of smallest weight in Step 1 of the above procedure, place
the symbol * in that entry of the syndrome look-up table to indicate that
retransmission is required.


Example 4.8.15 Assume complete nearest neighbour decoding. Construct a
syndrome look-up table for the binary linear code

C = {0000, 1011, 0101, 1110}.

From the cosets computed earlier, we choose the words 0000, 0001, 0010 and
1000 as coset leaders. Next, a parity-check matrix for C is

$$H = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{pmatrix}.$$

Now we construct a syndrome look-up table for C (Table 4.2). (We may also
interchange the two columns.) Note that each word of length 2 occurs exactly
once as a syndrome.
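The two construction steps can be combined in a short Python sketch for binary codes. Instead of listing the cosets explicitly, the sketch visits all words in order of increasing weight and records the first word seen for each syndrome. That word is a coset leader; ties are resolved lexicographically, which is an arbitrary choice made here for complete decoding. The function names syndrome and syndrome_table are hypothetical.

```python
from itertools import product


def syndrome(w, H):
    """S(w) = w H^T over GF(2), returned as a tuple."""
    return tuple(sum(a * b for a, b in zip(w, row)) % 2 for row in H)


def syndrome_table(H):
    """Map each syndrome to a coset leader (binary case, complete decoding)."""
    n = len(H[0])
    table = {}
    # Visit words by increasing weight, so the first word seen with a given
    # syndrome is a word of least weight in its coset.
    for w in sorted(product((0, 1), repeat=n), key=lambda x: (sum(x), x)):
        table.setdefault(syndrome(w, H), w)
    return table


H = [[1, 0, 1, 0],
     [1, 1, 0, 1]]
table = syndrome_table(H)
```

The search enumerates all $2^n$ words, so it is practical only for short codes.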


Example 4.8.16 A syndrome look-up table for C, assuming incomplete nearest
neighbour decoding, is given in Table 4.3.

Table 4.3.

Coset leader u    Syndrome S(u)
0000              00
*                 01
0010              10
1000              11


Remark 4.8.17 (i) Note that a unique coset leader corresponds to an error
pattern that can be corrected, assuming incomplete nearest neighbour decoding.
A coset leader (not necessarily unique) corresponds to an error pattern that can
be corrected, assuming complete nearest neighbour decoding.

(ii) A quicker way to construct a syndrome look-up table, given the
parity-check matrix H and distance d for the code C, is to generate all the error patterns
e with

$$\mathrm{wt}(\mathbf{e}) \le \left\lfloor \frac{d - 1}{2} \right\rfloor$$

as coset leaders (cf. Exercise 4.44) and compute the syndrome S(e) for each of
them.
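The shortcut of Remark 4.8.17(ii) can be sketched in Python for binary codes. The sketch generates every error pattern of weight at most $\lfloor (d-1)/2 \rfloor$ together with its syndrome; any cosets not reached this way still have to be handled separately. The function name low_weight_leaders and the sample matrix used for testing are hypothetical choices.

```python
from itertools import combinations


def low_weight_leaders(H, d):
    """Binary case: syndromes of all error patterns of weight <= (d - 1) // 2."""
    n, t = len(H[0]), (d - 1) // 2
    table = {}
    for wt in range(t + 1):
        for support in combinations(range(n), wt):
            e = tuple(1 if i in support else 0 for i in range(n))
            # The syndrome of e is the sum of the columns of H indexed by
            # the support of e.
            s = tuple(sum(row[i] for i in support) % 2 for row in H)
            table[s] = e
    return table
```

For a binary code of length 6 and distance 3, this produces the zero pattern and the six patterns of weight 1, i.e., seven rows of the look-up table.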

Example 4.8.18 Assuming complete nearest neighbour decoding, construct
a syndrome look-up table for the binary linear code C with parity-check
matrix H, where

$$H = \begin{pmatrix} 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \end{pmatrix}.$$

First, we claim that the distance of C is d = 3. This can be easily seen by
applying Corollary 4.5.7 and observing that no two columns of H are linearly
dependent while the second, third and fourth columns are linearly dependent.

As $\lfloor (d - 1)/2 \rfloor = 1$, all the error patterns with weight 0 or 1 will be coset
leaders. We then compute the syndrome for each of them and obtain the first
seven rows of the syndrome look-up table. Since every word of length 3 must
occur as a syndrome, the remaining coset leader u has syndrome $\mathbf{u}H^T = 101$.
Moreover, u must have weight $\ge 2$ since all the words of weight 0 or 1 have
already been included in the syndrome look-up table. Since we are looking for
a coset leader, it is reasonable to start looking among the remaining words of
the smallest available weight, i.e., 2. Doing so, we find three possible coset
leaders: 000101, 001010 and 110000. Since we are using complete nearest
neighbour decoding, we can arbitrarily choose 000101 as a coset leader and
complete the syndrome look-up table (Table 4.4).

Table 4.4.

Coset leader u    Syndrome S(u)
000000            000
100000            110
010000            011
001000            111
000100            100
000010            010
000001            001
000101            101


Note that, if incomplete nearest neighbour decoding is used, the coset leader
000101 in the last row of Table 4.4 will be replaced by *.

Decoding procedure for syndrome decoding

Step 1: For the received word w, compute the syndrome S(w).

Step 2: Find the coset leader u next to the syndrome S(w) = S(u) in the
syndrome look-up table.

Step 3: Decode w as v = w - u.

Example 4.8.19 Let q = 2 and let C = {0000, 1011, 0101, 1110}. Use the
syndrome look-up table constructed in Example 4.8.15 to decode (i) w = 1101;
(ii) w = 1111.

Recall the syndrome look-up table constructed in Example 4.8.15
(Table 4.2, repeated here for convenience).

Table 4.2 (repeated).

Coset leader u    Syndrome S(u)
0000              00
0001              01
0010              10
1000              11

(i) w = 1101. The syndrome is $S(\mathbf{w}) = \mathbf{w}H^T = 11$. From Table 4.2, we see
that the coset leader is 1000. Hence, 1101 + 1000 = 0101 was a most likely
codeword sent.

(ii) w = 1111. The syndrome is $S(\mathbf{w}) = \mathbf{w}H^T = 01$. From Table 4.2, we
see that the coset leader is 0001. Hence, 1111 + 0001 = 1110 was a most likely
codeword sent.
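The three decoding steps translate directly into code. The Python sketch below is restricted to binary codes; it stores Table 4.2 as a dictionary keyed by syndrome and uses the parity-check matrix of Example 4.8.15. The names syndrome, syndrome_decode and TABLE_4_2 are hypothetical.

```python
def syndrome(w, H):
    """Step 1: S(w) = w H^T over GF(2), returned as a tuple."""
    return tuple(sum(a * b for a, b in zip(w, row)) % 2 for row in H)


def syndrome_decode(w, H, leaders):
    """Steps 2 and 3: look up the coset leader u and return v = w - u."""
    u = leaders[syndrome(w, H)]
    # Over GF(2), w - u = w + u, computed here as a bitwise XOR.
    return tuple(a ^ b for a, b in zip(w, u))


# Parity-check matrix of Example 4.8.15 and Table 4.2, keyed by syndrome.
H = [[1, 0, 1, 0],
     [1, 1, 0, 1]]
TABLE_4_2 = {(0, 0): (0, 0, 0, 0), (0, 1): (0, 0, 0, 1),
             (1, 0): (0, 0, 1, 0), (1, 1): (1, 0, 0, 0)}
```

Only the $q^{n-k}$ rows of the look-up table need to be stored, rather than the full standard array of $q^n$ words.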
Exercises


4.1 Prove Proposition 4.1.6.

4.2 For each of the following sets, determine whether it is a vector space over
the given finite field $\mathbf{F}_q$. If it is a vector space, determine the number of
distinct bases it can have.

(a) q = 2, S = {(a, b, c, d, e) : a + b + c + d + e = 1},

(b) q = 3, T = {(x, y, z, w) : xyzw = 0},

(c) q = 5, $U = \{(\lambda + \mu, 2\mu, 3\lambda + \nu, \nu) : \lambda, \mu, \nu \in \mathbf{F}_5\}$,

(d) q prime, $V = \{(x_1, x_2, x_3) : x_1 = x_2 - x_3\}$.

4.3 For any given positive integer n and any $0 \le k \le n$, determine the number
of distinct subspaces of $\mathbf{F}_q^n$ of dimension k.

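4.4 (a) Let $\mathbf{F}_q$ be a subfield of $\mathbf{F}_r$. Show that $\mathbf{F}_r$ is a vector space over $\mathbf{F}_q$,
where the vector addition and the scalar multiplication are the same
as the addition and multiplication of the elements in the field $\mathbf{F}_r$,
respectively.

(b) Let $\alpha$ be a root of an irreducible polynomial of degree m over $\mathbf{F}_q$.
Show that $\{1, \alpha, \alpha^2, \ldots, \alpha^{m-1}\}$ is a basis of $\mathbf{F}_{q^m}$ over $\mathbf{F}_q$.

4.5 Define $\mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}(\alpha) = \alpha + \alpha^q + \cdots + \alpha^{q^{m-1}}$ for any $\alpha \in \mathbf{F}_{q^m}$. The element
$\mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}(\alpha)$ is called the trace of $\alpha$ with respect to the extension $\mathbf{F}_{q^m}/\mathbf{F}_q$.

(i) Show that $\mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}(\alpha)$ is an element of $\mathbf{F}_q$ for all $\alpha \in \mathbf{F}_{q^m}$.

(ii) Show that the map

$$\mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q} : \mathbf{F}_{q^m} \to \mathbf{F}_q, \quad \alpha \mapsto \mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}(\alpha)$$

is an $\mathbf{F}_q$-linear transformation, where both $\mathbf{F}_{q^m}$ and $\mathbf{F}_q$ are viewed as
vector spaces over $\mathbf{F}_q$.

(iii) Show that $\mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}$ is surjective.

(iv) Let $\beta \in \mathbf{F}_{q^m}$. Prove that $\mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}(\beta) = 0$ if and only if there exists
an element $\gamma \in \mathbf{F}_{q^m}$ such that $\beta = \gamma^q - \gamma$. (Note: this statement
is commonly referred to as the additive form of Hilbert's Theorem
90.)

(v) (Transitivity of trace.) Prove that

$$\mathrm{Tr}_{\mathbf{F}_{q^{rm}}/\mathbf{F}_q}(\alpha) = \mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}\left(\mathrm{Tr}_{\mathbf{F}_{q^{rm}}/\mathbf{F}_{q^m}}(\alpha)\right)$$

for any $\alpha \in \mathbf{F}_{q^{rm}}$.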

4.6 (a) Let V be a vector space over a finite field $\mathbf{F}_q$. Show that
$(\lambda \mathbf{u} + \mu \mathbf{v}) \cdot \mathbf{w} = \lambda(\mathbf{u} \cdot \mathbf{w}) + \mu(\mathbf{v} \cdot \mathbf{w})$, for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$ and $\lambda, \mu \in \mathbf{F}_q$.

(b) Give an example of a finite field $\mathbf{F}_q$ and a vector u defined over $\mathbf{F}_q$
with the property that $\mathbf{u} \ne \mathbf{0}$ but $\mathbf{u} \cdot \mathbf{u} = 0$.

(c) Let V be a vector space over a finite field $\mathbf{F}_q$ and let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ be
a basis of V. Show that the following two statements are equivalent:

(i) $\mathbf{v} \cdot \mathbf{v}' = 0$ for all $\mathbf{v}, \mathbf{v}' \in V$,

(ii) $\mathbf{v}_i \cdot \mathbf{v}_j = 0$ for all $i, j \in \{1, 2, \ldots, k\}$.

(Note: this shows that it suffices to check (ii) when we need to determine
whether a given linear code is self-orthogonal.)

4.7 Let $\mathbf{F}_q$ be a finite field and let S be a subset of $\mathbf{F}_q^n$.

(i) Show that $S^\perp$ and $\langle S \rangle^\perp$ are subspaces of $\mathbf{F}_q^n$.

(ii) Show that $S^\perp = \langle S \rangle^\perp$.

4.8 For each of the following sets S and corresponding finite fields $\mathbf{F}_q$, find
the $\mathbf{F}_q$-linear span $\langle S \rangle$ and its orthogonal complement $S^\perp$:

(a) S = {101, 111, 010}, q = 2,

(b) S = {1020, 0201, 2001}, q = 3,

(c) S = {00101, 10001, 11011}, q = 2.

Problems 4.9 to 4.13 deal with some well known inner products other than
the Euclidean inner product.

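4.9 Let $\langle\,,\rangle_H : \mathbf{F}_{q^2}^n \times \mathbf{F}_{q^2}^n \to \mathbf{F}_{q^2}$ be defined as

$$\langle \mathbf{u}, \mathbf{v} \rangle_H = \sum_{i=1}^{n} u_i v_i^q,$$

where $\mathbf{u} = (u_1, \ldots, u_n)$, $\mathbf{v} = (v_1, \ldots, v_n) \in \mathbf{F}_{q^2}^n$. Show that $\langle\,,\rangle_H$ is an
inner product on $\mathbf{F}_{q^2}^n$. (Note: this inner product is called the Hermitian
inner product. For a linear code C over $\mathbf{F}_{q^2}$, its Hermitian dual is defined
as

$$C^{\perp_H} = \{\mathbf{v} \in \mathbf{F}_{q^2}^n : \langle \mathbf{v}, \mathbf{c} \rangle_H = 0 \text{ for all } \mathbf{c} \in C\}.$$

If $C = C^{\perp_H}$, then we say C is self-dual with respect to the Hermitian
inner product.)

4.10 Write $\mathbf{F}_4 = \{0, 1, \alpha, \alpha^2\}$ (cf. Example 3.3.5). Show that the following
linear codes over $\mathbf{F}_4$ are self-dual with respect to the Hermitian inner
product:

(a) $C_1 = \{(0, 0), (1, 1), (\alpha, \alpha), (\alpha^2, \alpha^2)\}$;

(b) $C_2$ is the $\mathbf{F}_4$-linear code with generator matrix

$$\begin{pmatrix} 1 & 0 & 0 & 1 & \alpha & \alpha \\ 0 & 1 & 0 & \alpha & 1 & \alpha \\ 0 & 0 & 1 & \alpha & \alpha & 1 \end{pmatrix}.$$

(Note: the code $C_2$ is called the hexacode.)

Are $C_1$ and $C_2$ self-dual with respect to the Euclidean inner product?

4.11 Let $\langle\,,\rangle_S : \mathbf{F}_q^{2n} \times \mathbf{F}_q^{2n} \to \mathbf{F}_q$ be defined as

$$\langle (\mathbf{u}, \mathbf{v}), (\mathbf{u}', \mathbf{v}') \rangle_S = \mathbf{u} \cdot \mathbf{v}' - \mathbf{v} \cdot \mathbf{u}',$$

where $\mathbf{u}, \mathbf{v}, \mathbf{u}', \mathbf{v}' \in \mathbf{F}_q^n$ and $\cdot$ is the Euclidean inner product on $\mathbf{F}_q^n$. Show
that $\langle\,,\rangle_S$ is an inner product on $\mathbf{F}_q^{2n}$. (Note: this inner product is called
the symplectic inner product. It is useful in the construction of quantum
error-correcting codes.)

4.12 For $(\mathbf{u}, \mathbf{v}) \in \mathbf{F}_q^{2n}$, where $\mathbf{u} = (u_1, \ldots, u_n)$ and $\mathbf{v} = (v_1, \ldots, v_n)$, the
symplectic weight $\mathrm{wt}_S((\mathbf{u}, \mathbf{v}))$ of (u, v) is defined to be the number of
$1 \le i \le n$ such that at least one of $u_i$, $v_i$ is nonzero. Show that

$$\tfrac{1}{2}\mathrm{wt}((\mathbf{u}, \mathbf{v})) \le \mathrm{wt}_S((\mathbf{u}, \mathbf{v})) \le \mathrm{wt}((\mathbf{u}, \mathbf{v})),$$

where wt((u, v)) denotes the usual Hamming weight of (u, v).

4.13 Let C be a linear code over $\mathbf{F}_q$ with a generator matrix $(I_n \mid A)$, where $I_n$
is the $n \times n$ identity matrix and A is an $n \times n$ matrix satisfying $A = A^T$.

(i) Show that C is self-dual with respect to the symplectic inner product
$\langle\,,\rangle_S$, i.e., $C = C^{\perp_S}$, where

$$C^{\perp_S} = \{\mathbf{v} \in \mathbf{F}_q^{2n} : \langle \mathbf{v}, \mathbf{c} \rangle_S = 0 \text{ for all } \mathbf{c} \in C\}.$$

(ii) Show that C is equivalent to $C^\perp$, its dual under the usual Euclidean
inner product.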

4.14 Determine which of the following codes are linear over $\mathbf{F}_q$:

(a) q = 2 and C = {1101, 1110, 1011, 1111},

(b) q = 3 and C = {0000, 1001, 0110, 2002, 1111, 0220, 1221, 2112,
2222},

(c) q = 2 and C = {00000, 11110, 01111, 10001}.

4.15 Let C and D be linear codes over $\mathbf{F}_q$ of the same length. Define

$$C + D = \{\mathbf{c} + \mathbf{d} : \mathbf{c} \in C, \mathbf{d} \in D\}.$$

Show that C + D is a linear code and that $(C + D)^\perp = C^\perp \cap D^\perp$.

4.16 Determine whether each of the following statements is true or false.
Justify your answer.

(a) If C and D are linear codes over $\mathbf{F}_q$ of the same length, then $C \cap D$
is also a linear code over $\mathbf{F}_q$.

(b) If C and D are linear codes over $\mathbf{F}_q$ of the same length, then $C \cup D$
is also a linear code over $\mathbf{F}_q$.

(c) If $C = \langle S \rangle$, where $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\} \subseteq \mathbf{F}_q^n$, then dim(C) = 3.

(d) If $C = \langle S \rangle$, where $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\} \subseteq \mathbf{F}_q^n$, then

$$d(C) = \min\{\mathrm{wt}(\mathbf{v}_1), \mathrm{wt}(\mathbf{v}_2), \mathrm{wt}(\mathbf{v}_3)\}.$$

(e) If C and D are linear codes over $\mathbf{F}_q$ with $C \subseteq D$, then $D^\perp \subseteq C^\perp$.

4.17 Determine the number of binary linear codes with parameters [n, n - 1, 2]
for $n \ge 2$.

4.18 Prove Lemma 4.3.6.

4.19 Let $\mathbf{u} \in \mathbf{F}_2^n$. A binary code C of length n is said to correct the error
pattern u if and only if, for all $\mathbf{c}, \mathbf{c}' \in C$ with $\mathbf{c}' \ne \mathbf{c}$, we have
$d(\mathbf{c}, \mathbf{c} + \mathbf{u}) < d(\mathbf{c}', \mathbf{c} + \mathbf{u})$. Assume that $\mathbf{u}_1, \mathbf{u}_2 \in \mathbf{F}_2^n$ agree in at least the positions where
1 occurs in $\mathbf{u}_1$. Suppose that C corrects the error pattern $\mathbf{u}_2$. Prove that
C also corrects the error pattern $\mathbf{u}_1$.

4.20 (i) Let $\mathbf{x}, \mathbf{y} \in \mathbf{F}_2^n$. If x and y are both of even weight or both of odd
weight, show that x + y must have even weight.

(ii) Let $\mathbf{x}, \mathbf{y} \in \mathbf{F}_2^n$. If exactly one of x, y has even weight and the other
has odd weight, show that x + y must have odd weight.

(iii) Using (i) and (ii), or otherwise, prove that, for a binary linear code 
C, either all the codewords have even weight or exactly half of the 
codewords have even weight. 

4.21 Let C be a binary linear code of parameters [n, k, d]. Assume that C
has at least one codeword of odd weight. Let C' denote the subset of C
consisting of all the codewords of even weight. Show that C' is a binary
linear code of parameters [n, k - 1, d'], with d' > d if d is odd, and
d' = d if d is even. (Note: this is an example of an expurgated code.)

4.22 (a) Show that every codeword in a self-orthogonal binary code has even 

weight. 

(b) Show that the weight of every codeword in a self-orthogonal ternary 
code is divisible by 3. 

(c) Construct a self-orthogonal code over $\mathbf{F}_5$ such that at least one of its
codewords has weight not divisible by 5.

(d) Let x, y be codewords in a self-orthogonal binary code. Suppose the 
weights of x and y are both divisible by 4. Show that the weight of 
x + y is also a multiple of 4. 

4.23 Let C be a self-dual binary code with parameters [n, k, d].

(i) Show that the all-one vector (1, 1, . . . , 1) is in C.

(ii) Show that either all the codewords in C have weight divisible by
4; or exactly half of the codewords in C have weight divisible by 4
while the other half have even weight not divisible by 4.

(iii) Let n = 6. Determine d.

4.24 Give a parity-check matrix for a self-dual binary code of length 10.

4.25 Prove that there is no self-dual binary code of parameters [10, 5, 4].

4.26 For n odd, let C be a self-orthogonal binary [n, (n - 1)/2]-code. Let $\mathbf{1}$
denote the all-one vector of length n and let $\mathbf{1} + C = \{\mathbf{1} + \mathbf{c} : \mathbf{c} \in C\}$.
Show that $C^\perp = C \cup (\mathbf{1} + C)$.

4.27 Let C be a linear code over $\mathbf{F}_q$ of length n. For any given i with $1 \le i \le n$,
show that either the $i$th position of every codeword of C is 0 or every
element $\alpha \in \mathbf{F}_q$ appears in the $i$th position of exactly 1/q of the codewords
of C.

4.28 Let C be a linear code over $\mathbf{F}_q$ of parameters [n, k, d] and suppose that,
for every $1 \le i \le n$, there is at least one codeword whose $i$th position is
nonzero.

(i) Show that the sum of the weights of all the codewords in C is
$n(q - 1)q^{k-1}$.

(ii) Show that $d \le n(q - 1)q^{k-1}/(q^k - 1)$.

(iii) Show that there cannot be a binary linear code of parameters
[15, 7, d] with $d \ge 8$.

4.29 Let x, y be two linearly independent vectors in $\mathbf{F}_q^n$ and let z denote the
number of coordinates where x, y are both 0.

(i) Show that $\mathrm{wt}(\mathbf{y}) + \sum_{\lambda \in \mathbf{F}_q} \mathrm{wt}(\mathbf{x} + \lambda \mathbf{y}) = q(n - z)$.

(ii) Suppose further that x, y are contained in an [n, k, d]-code C over
$\mathbf{F}_q$. Show that $\mathrm{wt}(\mathbf{x}) + \mathrm{wt}(\mathbf{y}) \le qn - (q - 1)d$.

4.30 Let C be an [n, k, d]-code over $\mathbf{F}_q$, where gcd(d, q) = 1. Suppose that
all the codewords of C have weight congruent to 0 or d modulo q.

(i) If x, y are linearly independent codewords such that $\mathrm{wt}(\mathbf{x}) \equiv \mathrm{wt}(\mathbf{y}) \equiv
0 \pmod{q}$, show that $\mathrm{wt}(\mathbf{x} + \lambda \mathbf{y}) \equiv 0 \pmod{q}$ for all $\lambda \in \mathbf{F}_q$. (Hint:
use Exercise 4.29.)

(ii) Show that $C_0 = \{\mathbf{c} \in C : \mathrm{wt}(\mathbf{c}) \equiv 0 \pmod{q}\}$ is a linear subcode
of C; i.e., $C_0$ is a linear code contained in C.

(iii) Show that C cannot have a linear subcode of dimension 2 all of
whose nonzero codewords have weight congruent to d (mod q).
Hence, deduce that $C_0$ has dimension k - 1.

(iv) Given a generator matrix $G_0$ for $C_0$ and a codeword $\mathbf{v} \in C$ of weight
d, show that

$$\begin{pmatrix} \mathbf{v} \\ G_0 \end{pmatrix}$$

is a generator matrix for C.

4.31 Find a generator matrix and a parity-check matrix for the linear code 
generated by each of the following sets, and give the parameters [n,k,d] 
for each of these codes: 

(a) q = 2, S = {1000, 0110, 0010, 0001, 1001}, 

(b) q = 3, S = {110000, 011000, 001100, 000110, 000011}, 

(c) q = 2, S ={10101010,11001100,11110000,01100110,00111100}. 





4.32 Assign messages to the words in $\mathbf{F}_2^3$ as follows:

000  100  010  001  110  101  011  111
 A    C    D    E    G    I    N    O

Let C be the binary linear code with generator matrix

$$G = \begin{pmatrix} 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix}.$$

Use G to encode the message ENCODING.

4.33 Find a generator matrix G' in standard form for a binary linear code
equivalent to the binary linear code with the given generator matrix G:

(a) $G = \begin{pmatrix} 1&0&1&0&1&0 \\ 0&1&0&1&0&1 \\ 1&1&0&1&1&0 \\ 0&0&1&0&1&1 \end{pmatrix}$,

(b) $G = \begin{pmatrix} 1&0&1&1&0&0&1&1&1 \\ 0&0&0&1&0&1&1&0&0 \\ 0&0&0&1&0&1&1&1&0 \end{pmatrix}$.


4.34 Find a generator matrix G' in standard form for a binary linear code 
C' equivalent to the binary linear code C with the given parity-check 
matrix H : 

(a) $H = \begin{pmatrix} 1&1&0&0&0 \\ 0&1&1&0&1 \\ 0&0&0&1&1 \end{pmatrix}$,

(b) $H = \begin{pmatrix} 0&1&0&1&1&1&0 \\ 1&1&1&1&0&0&0 \\ 0&1&1&0&1&0&1 \end{pmatrix}$.

4.35 Construct a binary code C of length 6 as follows: for every
$(x_1, x_2, x_3) \in \mathbf{F}_2^3$, construct a 6-bit word $(x_1, x_2, x_3, x_4, x_5, x_6) \in C$, where

$x_4 = x_1 + x_2 + x_3$,

$x_5 = x_1 + x_3$,

$x_6 = x_2 + x_3$.

(i) Show that C is a linear code. 

(ii) Find a generator matrix and a parity-check matrix for C. 

4.36 Construct a binary code C of length 8 as follows: for every
$(a, b, c, d) \in \mathbf{F}_2^4$, construct an 8-bit word $(a, b, c, d, w, x, y, z) \in C$, where


w = a + b + c, 
x = a + b + d, 
y = a + c + d, 
z = b + c + d. 

(i) Show that C is a linear code. 

(ii) Find a generator matrix and a parity-check matrix for C.

(iii) Show that C is exactly three-error-detecting and one-error-correcting.

(iv) Show that C is self-dual. 

4.37 (a) Prove that equivalent linear codes always have the same length,
dimension and distance.

(b) Show that, if C and C' are equivalent, then so are their duals $C^\perp$
and $(C')^\perp$.

4.38 Suppose that an $(n-k) \times n$ matrix H is a parity-check matrix for a linear
code C over $\mathbf{F}_q$. Show that, if M is an invertible $(n-k) \times (n-k)$ matrix
with entries in $\mathbf{F}_q$, then MH is also a parity-check matrix for C.

4.39 Find the distance of the binary linear code C with each of the following 
given parity-check matrices: 


/ Oil 1000 \ 


/1 101000 \ 

1110100 

, (b )H = 

1010100 

1100010 

0110010 

V 1010001 ) 


\ 1 100001/ 


4.40 Let n > 4 and let H be a parity-check matrix for a binary linear code C 
of length n. Suppose that the columns of H are all distinct and that the 
weight of every column of H is odd. Show that the distance of C is at 
least 4. 

4.41 List the cosets of each of the following q-ary linear codes:

(a) q = 3 and $C_3$ = {0000, 1010, 2020, 0101, 0202, 1111, 1212, 2121,
2222},

(b) q = 2 and $C_2$ = {00000, 10001, 01010, 11011, 00100, 10101,
01110, 11111}.

4.42 Let H denote the parity-check matrix of a linear code C . Show that the 
coset of C whose syndrome is v contains a vector of weight t if and only 
if v is equal to some linear combination of t columns of H. 

4.43 For m, n satisfying $2^{m-1} \le n < 2^m$, let C be the binary [n, n - m]-code
whose parity-check matrix H has as its ith column ($1 \le i \le n$) the binary
representation of i (i.e., the first column is $(0 \ldots 01)^T$, the second column
is $(0 \ldots 010)^T$ and the third column is $(0 \ldots 011)^T$, etc.). Show that every
coset of C contains a vector of weight $\le 2$.

4.44 Let $C \subseteq \mathbf{F}_q^n$ be a linear code with distance d. Show that a word $\mathbf{x} \in \mathbf{F}_q^n$ is
the unique coset leader of x + C if $\mathrm{wt}(\mathbf{x}) \le \lfloor (d-1)/2 \rfloor$.

4.45 Let C be a linear code of distance d, where d is even. Show that some
coset of C contains two vectors of weight e + 1, where $e = \lfloor (d-1)/2 \rfloor$.

4.46 Show that

$$\begin{pmatrix} 1&0&2&0 \\ 0&1&0&2 \end{pmatrix}$$

is a parity-check matrix for $C_3$ in Exercise 4.41 and that

$$\begin{pmatrix} 1&0&0&0&1 \\ 0&1&0&1&0 \end{pmatrix}$$

is a parity-check matrix for $C_2$ in Exercise 4.41. Using these parity-check
matrices and assuming complete decoding, construct a syndrome look-up
table for each of $C_3$ and $C_2$.

4.47 Let C be the binary linear code with parity-check matrix

$$H = \begin{pmatrix} 1&1&0&1&0&0 \\ 1&0&1&0&1&0 \\ 0&1&1&0&0&1 \end{pmatrix}.$$

Write down a generator matrix for C and list all the codewords in C. 
Decode the following words: 

(a) 110110, (b) 011011, (c) 101010. 

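4.48 Let p be a prime and let $\zeta$ denote a primitive pth root of unity in $\mathbf{C}$, the
field of complex numbers (i.e., $\zeta^p = 1$ but $\zeta^i \ne 1$ for all 0 < i < p). Let
f be a function defined on $\mathbf{F}_p^n$ such that the values $f(\mathbf{v})$, where $\mathbf{v} \in \mathbf{F}_p^n$, can
be added and subtracted, and multiplied naturally by complex numbers.
Define the discrete Fourier transform $\hat{f}$ of f as follows:

$$\hat{f}(\mathbf{u}) = \sum_{\mathbf{v} \in \mathbf{F}_p^n} f(\mathbf{v})\, \zeta^{\mathbf{u} \cdot \mathbf{v}},$$

where $\mathbf{u} \cdot \mathbf{v}$ is the Euclidean inner product in $\mathbf{F}_p^n$. Let C be a linear code
of length n over $\mathbf{F}_p$ and, for $\mathbf{v} \in \mathbf{F}_p^n$, define

$$C_i(\mathbf{v}) = \{\mathbf{u} \in C : \mathbf{u} \cdot \mathbf{v} = i\}, \quad \text{for } 0 \le i \le p - 1.$$

(i) Show that, for $1 \le i \le p - 1$, $C_i(\mathbf{v})$ is a coset of $C_0(\mathbf{v})$ in C
if and only if $\mathbf{v} \notin C^\perp$. Show also that, if $\mathbf{v} \notin C^\perp$, then
$C = C_0(\mathbf{v}) \cup C_1(\mathbf{v}) \cup \cdots \cup C_{p-1}(\mathbf{v})$.

(ii) Show that

$$\sum_{\mathbf{u} \in C} \zeta^{\mathbf{u} \cdot \mathbf{v}} = \begin{cases} |C| & \text{if } \mathbf{v} \in C^\perp, \\ 0 & \text{if } \mathbf{v} \notin C^\perp. \end{cases}$$

(iii) Show that $f(\mathbf{w}) = \frac{1}{p^n} \sum_{\mathbf{u} \in \mathbf{F}_p^n} \hat{f}(\mathbf{u})\, \zeta^{-\mathbf{u} \cdot \mathbf{w}}$, where $\mathbf{w} \in \mathbf{F}_p^n$.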



(iv) Show that $\sum_{\mathbf{v} \in C^\perp} f(\mathbf{v}) = \frac{1}{|C|} \sum_{\mathbf{u} \in C} \hat{f}(\mathbf{u})$.

4.49 Let C be a linear code of length n over $\mathbf{F}_p$, where p is a prime. The
(Hamming) weight enumerator of C is the homogeneous polynomial

$$W_C(x, y) = \sum_{\mathbf{u} \in C} x^{n - \mathrm{wt}(\mathbf{u})}\, y^{\mathrm{wt}(\mathbf{u})}.$$

By setting $f(\mathbf{u}) = x^{n - \mathrm{wt}(\mathbf{u})} y^{\mathrm{wt}(\mathbf{u})}$ in Exercise 4.48, or otherwise, show that

$$W_{C^\perp}(x, y) = \frac{1}{|C|}\, W_C(x + (p-1)y,\; x - y).$$

(Note: this identity is called the MacWilliams identity. It actually holds
for all finite fields $\mathbf{F}_q$, with p replaced by q in the above, though the proof
is slightly more complicated.)



5 Bounds in coding theory 


Given a q-ary (n, M, d)-code, where n is fixed, the size M is a measure of the
efficiency of the code, and the distance d is an indication of its error-correcting 
capability. It would be nice if both M and d could be as large as possible, but, as 
we shall see shortly in this chapter, this is not quite possible, and a compromise 
needs to be struck. 

For given q, n and d, we shall discuss some well known upper and lower 
bounds for the largest possible value of M. In the case where M is actually 
equal to one of the well known bounds, interesting codes such as perfect codes 
and MDS codes are obtained. We also discuss certain properties and examples 
of some of these fascinating families. 


5.1 The main coding theory problem 

Let C be a q-ary code with parameters (n, M, d). Recall from Chapter 2 that the
information rate (or transmission rate) of C is defined to be $\mathcal{R}(C) = (\log_q M)/n$.
We also introduce here the notion of the relative minimum distance.


Definition 5.1.1 For a q-ary code C with parameters (n, M, d), the relative
minimum distance of C is defined to be $\delta(C) = (d-1)/n$.

Remark 5.1.2 The relative minimum distance of C is often defined to be d/n
in the literature, but defining it as $(d-1)/n$ sometimes leads to neater formulas
(see Remark 5.4.4).


Example 5.1.3 (i) Consider the q-ary code $C = \mathbf{F}_q^n$. It is easy to see that
$(n, M, d) = (n, q^n, 1)$ or, alternatively, $[n, k, d] = [n, n, 1]$. Hence,

$$\mathcal{R}(C) = \frac{\log_q(q^n)}{n} = 1, \qquad \delta(C) = 0.$$

This code has the maximum possible information rate, while its relative 
minimum distance is 0. As the minimum distance of a code is related closely 
to its error-correcting capability (cf. Theorem 2.5.10), a low relative minimum 
distance implies a relatively low error-correcting capability. 

(ii) Consider the binary repetition code

$$C = \{00\cdots0,\; 11\cdots1\}.$$

Clearly, (n, M, d) = (n, 2, n) or, equivalently, C is a binary [n, 1, n]-linear
code. Hence,

$$\mathcal{R}(C) = \frac{\log_2(2)}{n} = \frac{1}{n} \to 0, \qquad \delta(C) = \frac{n-1}{n} \to 1,$$

as $n \to \infty$. As this code has the largest possible relative minimum distance,
it has excellent error-correcting potential. However, this is achieved at the cost 
of very low efficiency, as reflected in the low information rate. 

(iii) There is a family of binary linear codes (called Hamming codes - see
Section 5.3.1) with parameters $(n, M, d) = (2^r - 1, 2^{n-r}, 3)$ or, equivalently,
$[n, k, d] = [2^r - 1, 2^r - 1 - r, 3]$, for all integers $r \ge 2$. When $r \to \infty$, we
have

$$\mathcal{R}(C) = \frac{\log_2(2^{n-r})}{n} = \frac{2^r - 1 - r}{2^r - 1} \to 1, \qquad \delta(C) = \frac{2}{n} \to 0.$$

Again, while this family of codes has good information rates asymptotically, 
the relative minimum distances tend to zero, implying asymptotically bad error- 
correcting capabilities. 


The previous examples should make it clear that a compromise between the 
transmission rate and the quality of error-correction is necessary. 
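
The trade-off in Example 5.1.3 can be tabulated numerically. The following is a minimal Python sketch; the function name `rate_and_relative_distance` is ours, not the book's, and it assumes a linear code, so that $M = q^k$ and $\mathcal{R}(C) = k/n$.

```python
from fractions import Fraction

def rate_and_relative_distance(q, n, k, d):
    """Return (R(C), delta(C)) for a q-ary [n, k, d]-linear code.

    For a linear code M = q^k, so R(C) = log_q(M)/n = k/n (independent of q),
    and delta(C) = (d - 1)/n as in Definition 5.1.1.
    """
    return Fraction(k, n), Fraction(d - 1, n)

# (i) The whole space F_q^n, here with q = 3 and n = 5.
print(rate_and_relative_distance(3, 5, 5, 1))

# (ii) Binary repetition codes [n, 1, n]: the rate tends to 0, delta tends to 1.
for n in (3, 15, 255):
    print(n, rate_and_relative_distance(2, n, 1, n))

# (iii) Binary Hamming codes [2^r - 1, 2^r - 1 - r, 3]: rate -> 1, delta -> 0.
for r in (2, 4, 8):
    n = 2**r - 1
    print(n, rate_and_relative_distance(2, n, n - r, 3))
```

Exact fractions are used instead of floating-point numbers so that the limits can be read off without rounding noise.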


Definition 5.1.4 For a given code alphabet A of size q (with q > 1) and given
values of n and d, let $A_q(n, d)$ denote the largest possible size M for which
there exists an (n, M, d)-code over A. Thus,

$$A_q(n, d) = \max\{M : \text{there exists an } (n, M, d)\text{-code over } A\}.$$

Any (n, M, d)-code C that has the maximum size, that is, for which
$M = A_q(n, d)$, is called an optimal code.

Remark 5.1.5 (i) Note that $A_q(n, d)$ depends only on the size of A, n and d.
It is independent of A.

(ii) The numbers $A_q(n, d)$ play a central role in coding theory, and much
effort has been made in determining their values. In fact, the problem of
determining the values of $A_q(n, d)$ is sometimes known as the main coding
theory problem.

Instead of considering all codes, we may restrict ourselves to linear codes 
and obtain the following definition: 

Definition 5.1.6 For a given prime power q and given values of n and d, let
$B_q(n, d)$ denote the largest possible size $q^k$ for which there exists an [n, k, d]-
code over $\mathbf{F}_q$. Thus,

$$B_q(n, d) = \max\{q^k : \text{there exists an } [n, k, d]\text{-code over } \mathbf{F}_q\}.$$

While it is, in general, rather difficult to determine the exact values of
$A_q(n, d)$ and $B_q(n, d)$, there are some properties that afford easy proofs.

Theorem 5.1.7 Let $q \ge 2$ be a prime power. Then

(i) $B_q(n, d) \le A_q(n, d) \le q^n$ for all $1 \le d \le n$;

(ii) $B_q(n, 1) = A_q(n, 1) = q^n$;

(iii) $B_q(n, n) = A_q(n, n) = q$.

Proof. The first inequality in (i) is obvious from the definitions, while the
second one is clear since any (n, M, d)-code over $\mathbf{F}_q$, being a nonempty subset
of $\mathbf{F}_q^n$, must have $M \le q^n$.

To show (ii), note that $\mathbf{F}_q^n$ is an [n, n, 1]-linear code, and hence an $(n, q^n, 1)$-
code, over $\mathbf{F}_q$, so $q^n \le B_q(n, 1) \le A_q(n, 1) \le q^n$ by (i); i.e.,
$B_q(n, 1) = A_q(n, 1) = q^n$.

For (iii), let C be an (n, M, n)-code over $\mathbf{F}_q$. Since the codewords are of
length n, and the distance between two distinct codewords is $\ge n$, it follows
that the distance between two distinct codewords is actually n. This means that
two distinct codewords must differ at all the coordinates. Therefore, at each
coordinate, all the M words must take different values, so $M \le q$, implying
$B_q(n, n) \le A_q(n, n) \le q$. The repetition code of length n, i.e.,
$\{(a, a, \ldots, a) : a \in \mathbf{F}_q\}$, is an [n, 1, n]-linear code, and hence an (n, q, n)-code, over $\mathbf{F}_q$, so
$B_q(n, n) = A_q(n, n) = q$. □

In the case of binary codes, there are additional elementary results on
$A_2(n, d)$ and $B_2(n, d)$. Before we discuss them, we need to introduce the
notion of the extended code, which is a useful concept in its own right. For
a binary linear code, its extended code is obtained by adding a parity-check
coordinate. This idea can be generalized to codes over any finite field.

Definition 5.1.8 For any code C over $\mathbf{F}_q$, the extended code of C, denoted by
$\overline{C}$, is defined to be

$$\overline{C} = \left\{ \left(c_1, \ldots, c_n, -\sum_{i=1}^{n} c_i\right) : (c_1, \ldots, c_n) \in C \right\}.$$

When q = 2, the extra coordinate $-\sum_{i=1}^{n} c_i = \sum_{i=1}^{n} c_i$ added to the codeword
$(c_1, \ldots, c_n)$ is called the parity-check coordinate.

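Theorem 5.1.9 If C is an (n, M, d)-code over $\mathbf{F}_q$, then $\overline{C}$ is an (n + 1, M, d')-
code over $\mathbf{F}_q$, with $d \le d' \le d + 1$. If C is linear, then so is $\overline{C}$. Moreover,
when C is linear,

$$\begin{pmatrix} & & & 0 \\ & H & & \vdots \\ & & & 0 \\ 1 & \cdots & 1 & 1 \end{pmatrix}$$

is a parity-check matrix of $\overline{C}$ if H is a parity-check matrix of C.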

The proof is straightforward, so it is left to the reader (Exercise 5.3). 

Example 5.1.10 (i) Consider the binary linear code $C_1$ = {000, 110, 011,
101}. It has parameters [3, 2, 2]. The extended code

$\overline{C_1}$ = {0000, 1100, 0110, 1010}

is a binary [4, 2, 2]-linear code.

(ii) Consider the binary linear code $C_2$ = {000, 111, 011, 100}. It has
parameters [3, 2, 1]. The extended code

$\overline{C_2}$ = {0000, 1111, 0110, 1001}

is a binary [4, 2, 2]-linear code.
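
The two cases of Example 5.1.10 can be checked mechanically. The sketch below is only an illustration for binary codes given as lists of bit strings; the helper names `extend` and `min_distance` are ours.

```python
from itertools import combinations

def extend(code):
    """Append the parity-check coordinate (sum of the bits mod 2) to each word."""
    return [w + str(sum(map(int, w)) % 2) for w in code]

def min_distance(code):
    """Minimum Hamming distance over all pairs of distinct codewords."""
    return min(sum(a != b for a, b in zip(x, y))
               for x, y in combinations(code, 2))

C1 = ["000", "110", "011", "101"]
C2 = ["000", "111", "011", "100"]
for C in (C1, C2):
    # Print the extended code, then d(C) and the distance of the extended code.
    print(extend(C), min_distance(C), min_distance(extend(C)))
```

For the first code the distance stays at 2, while for the second it increases from 1 to 2.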

This example shows that the minimum distance $d(\overline{C})$ can achieve both d(C)
and d(C) + 1. Example 5.1.10(ii) is an illustration of the following fact.

Theorem 5.1.11 Suppose d is odd.

(i) Then a binary (n, M, d)-code exists if and only if a binary (n + 1, M, d + 1)-
code exists. Therefore, if d is odd, $A_2(n + 1, d + 1) = A_2(n, d)$.

(ii) Similarly, a binary [n, k, d]-linear code exists if and only if a binary [n + 1,
k, d + 1]-linear code exists, so $B_2(n + 1, d + 1) = B_2(n, d)$.

Proof. For (i), the latter statement follows immediately from the previous one,
so we only prove the earlier statement.

Suppose that there is a binary (n, M, d)-code C, where d is odd. Then, from
Theorem 5.1.9, $\overline{C}$ is an (n + 1, M, d')-code with $d \le d' \le d + 1$.

Note that wt(x') is even for all $\mathbf{x}' \in \overline{C}$. Therefore, Lemma 4.3.5 and
Corollary 4.3.4 show that d(x', y') is even for all $\mathbf{x}', \mathbf{y}' \in \overline{C}$, so d' is even.
Since d is odd and $d \le d' \le d + 1$, it follows that d' = d + 1.

We have therefore shown that, if there is a binary (n, M, d)-code C, then $\overline{C}$
is a binary (n + 1, M, d + 1)-code.

Next, we suppose that there exists a binary (n + 1, M, d + 1)-code D, where
d is odd. Choose codewords x and y in D such that d(x, y) = d + 1. In
other words, x and y differ at $d + 1 \ge 2$ coordinates. Choose a coordinate
where x and y differ, and let D' be the code obtained by deleting this coordinate
from all the codewords of D. (The code D' is called a punctured code; see
Theorem 6.1.1(iii).) Then D' is a binary (n, M, d)-code.

For (ii), it suffices to observe that, in the proof of (i), if C is linear, then so
is $\overline{C}$; similarly, if D is linear, then so is D'. □


Remark 5.1.12 The last statement in Theorem 5.1.11(i) is equivalent to 'if d
is even, then $A_2(n, d) = A_2(n - 1, d - 1)$'. There is also an analogue for (ii).

While the determination of the exact values of A q (n, d) and B q (n, d) can 
be rather difficult, several well known bounds, both upper and lower ones, do 
exist. We shall discuss some of them in the following sections. 

A list of lower bounds and, in some cases, exact values for $A_2(n, d)$ may
be found at the following webpage maintained by Simon Litsyn of Tel Aviv
University:


http://www.eng.tau.ac.il/~litsyn/tableand/index.html.

The following website, maintained by Andries E. Brouwer of Technische
Universiteit Eindhoven, contains tables that give the best known bounds (upper
and lower) on the distance d for q-ary linear codes ($q \le 9$) of given length and
dimension:


http://www.win.tue.nl/~aeb/voorlincod.html.

5.2 Lower bounds

We discuss two well known lower bounds: the sphere-covering bound (for
$A_q(n, d)$) and the Gilbert-Varshamov bound (for $B_q(n, d)$).

5.2.1 Sphere-covering bound 

Definition 5.2.1 Let A be an alphabet of size q, where q > 1. For any vector
$\mathbf{u} \in A^n$ and any integer $r \ge 0$, the sphere of radius r and centre u, denoted
$S_A(\mathbf{u}, r)$, is the set $\{\mathbf{v} \in A^n : d(\mathbf{u}, \mathbf{v}) \le r\}$.

Definition 5.2.2 For a given integer q > 1, a positive integer n and an integer
$r \ge 0$, define $V_q^n(r)$ to be

$$V_q^n(r) = \begin{cases} \binom{n}{0} + \binom{n}{1}(q-1) + \binom{n}{2}(q-1)^2 + \cdots + \binom{n}{r}(q-1)^r & \text{if } 0 \le r \le n, \\ q^n & \text{if } n \le r. \end{cases}$$

Lemma 5.2.3 For all integers $r \ge 0$, a sphere of radius r in $A^n$ contains exactly
$V_q^n(r)$ vectors, where A is an alphabet of size q > 1.

Proof. Fix a vector $\mathbf{u} \in A^n$. We determine the number of vectors $\mathbf{v} \in A^n$ such
that d(u, v) = m; i.e., the number of vectors in $A^n$ of distance exactly m from u.
The number of ways in which to choose the m coordinates where v differs from
u is given by $\binom{n}{m}$. For each such coordinate, we have q - 1 choices for that coordinate
in v. Therefore, the total number of vectors of distance m from u is given by
$\binom{n}{m}(q-1)^m$. For $0 \le r \le n$, Lemma 5.2.3 now follows by summing over
m = 0, 1, ..., r.

When $r \ge n$, note that $S_A(\mathbf{u}, r) = A^n$, hence it contains $V_q^n(r) = q^n$ vectors.

□
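
As a sanity check on Lemma 5.2.3, the closed formula for $V_q^n(r)$ can be compared with a direct enumeration of a sphere. This is a small illustrative Python sketch; the function names are ours, and the alphabet is taken to be {0, 1, ..., q - 1} with the sphere centred at the all-zero word.

```python
from itertools import product
from math import comb

def sphere_volume(q, n, r):
    """V_q^n(r): number of words within distance r of a fixed word in A^n."""
    if r >= n:
        return q**n
    return sum(comb(n, i) * (q - 1)**i for i in range(r + 1))

def sphere_volume_brute(q, n, r):
    """Count the sphere of radius r around the all-zero word by enumeration."""
    return sum(1 for v in product(range(q), repeat=n)
               if sum(x != 0 for x in v) <= r)

for q, n, r in [(2, 5, 2), (3, 4, 1), (3, 4, 6)]:
    print(q, n, r, sphere_volume(q, n, r), sphere_volume_brute(q, n, r))
```

The enumeration is exponential in n and is meant only for very small parameters.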


We are now ready to state and prove the sphere-covering bound. 


Theorem 5.2.4 (Sphere-covering bound.) For an integer q > 1 and integers
n, d such that $1 \le d \le n$, we have

$$\frac{q^n}{\sum_{i=0}^{d-1} \binom{n}{i}(q-1)^i} \le A_q(n, d).$$


Proof. Let $C = \{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_M\}$ be an optimal (n, M, d)-code over A with
|A| = q, so $M = A_q(n, d)$. Since C has the maximum size, there can be no
word in $A^n$ whose distance from every codeword in C is at least d. If there
were such a word, we could simply include it in C, and thereby obtain an
(n, M + 1, d)-code.

Therefore, for every vector x in $A^n$, there is at least one codeword $\mathbf{c}_i$ in C
such that $d(\mathbf{x}, \mathbf{c}_i)$ is at most d - 1; i.e., $\mathbf{x} \in S_A(\mathbf{c}_i, d - 1)$. Hence, every word
in $A^n$ is contained in at least one of the spheres $S_A(\mathbf{c}_i, d - 1)$. In other words,

$$A^n \subseteq \bigcup_{i=1}^{M} S_A(\mathbf{c}_i, d - 1).$$

(For this reason, we say that the spheres $S_A(\mathbf{c}_i, d - 1)$ ($1 \le i \le M$) cover $A^n$,
hence the name 'sphere-covering' bound.)

Since $|A^n| = q^n$ and $|S_A(\mathbf{c}_i, d - 1)| = V_q^n(d - 1)$ for any i, we have

$$q^n \le M \cdot V_q^n(d - 1),$$

implying that

$$\frac{q^n}{V_q^n(d - 1)} \le M = A_q(n, d).$$

□
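
The argument in the proof can be mirrored computationally: a code to which no further word can be added must have spheres of radius d - 1 that cover $A^n$, so its size is at least the bound. The Python sketch below is an illustration only; the function names are ours, and the greedy search (which scans words in lexicographic order) is just one way to produce such a maximal code, not a construction from the text.

```python
from itertools import product
from math import comb

def sphere_volume(q, n, r):
    """V_q^n(r) as in Definition 5.2.2."""
    if r >= n:
        return q**n
    return sum(comb(n, i) * (q - 1)**i for i in range(r + 1))

def sphere_covering_bound(q, n, d):
    """Smallest integer M with M * V_q^n(d-1) >= q^n; a lower bound on A_q(n, d)."""
    v = sphere_volume(q, n, d - 1)
    return -(-q**n // v)   # ceiling division on integers

def greedy_code(q, n, d):
    """Add words one at a time, keeping all pairwise distances >= d.

    When the loop ends no word can be added, which is exactly the
    maximality property used in the proof of Theorem 5.2.4.
    """
    code = []
    for w in product(range(q), repeat=n):
        if all(sum(a != b for a, b in zip(w, c)) >= d for c in code):
            code.append(w)
    return code

print(sphere_covering_bound(2, 5, 4), len(greedy_code(2, 5, 4)))
print(sphere_covering_bound(2, 7, 3), len(greedy_code(2, 7, 3)))
```

The greedy code is typically larger than the bound, which reflects how weak the sphere-covering bound can be.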


Some examples of the lower bounds for $A_q(n, d)$ given by the sphere-covering
bound are found in Tables 5.2-5.4 (see Example 5.5.5).

The following example illustrates how $A_q(n, d)$ may be found in some special
cases. In the example, the lower bound is given by the sphere-covering
bound. Then a combinatorial argument shows that the lower bound must also
be an upper bound for $A_q(n, d)$, hence yielding the exact value of $A_q(n, d)$.


Example 5.2.5 We prove that $A_2(5, 4) = 2$.

The sphere-covering bound shows that $A_2(5, 4) \ge 2$.

By Theorem 5.1.11, we see that $A_2(5, 4) = A_2(4, 3)$, so we next show that
$A_2(4, 3) \le 2$. Let C be a binary (4, M, 3)-code and let $(x_1, x_2, x_3, x_4)$ be a
codeword in C. Since d(C) = 3, the other codewords in C must be of the
following forms:

$(x_1, \bar{x}_2, \bar{x}_3, \bar{x}_4)$, $(\bar{x}_1, x_2, \bar{x}_3, \bar{x}_4)$, $(\bar{x}_1, \bar{x}_2, x_3, \bar{x}_4)$,

$(\bar{x}_1, \bar{x}_2, \bar{x}_3, x_4)$, $(\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4)$,

where $\bar{x}_i$ is defined by

$$\bar{x}_i = \begin{cases} 1 & \text{if } x_i = 0, \\ 0 & \text{if } x_i = 1. \end{cases}$$

However, no pair of these five words are of distance 3 (or more) apart, and
so only one of them can be included in C. Hence, $M \le 2$, implying that
$A_2(4, 3) \le 2$. Therefore, $A_2(5, 4) = A_2(4, 3) = 2$.
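
For such tiny parameters the value in Example 5.2.5 can also be confirmed by exhaustive search. The following Python sketch is ours and is feasible only for very small n; it searches over binary codes with minimum distance at least d (which, for the parameters used here, gives the same maximum as distance exactly d), and it encodes words as integers.

```python
def max_code_size(n, d):
    """Largest size of a binary code of length n with all distances >= d."""
    def dist(a, b):
        return bin(a ^ b).count("1")   # Hamming distance of two bit patterns

    def grow(size, candidates):
        # candidates: words at distance >= d from every word chosen so far
        best = size
        for idx, w in enumerate(candidates):
            rest = [v for v in candidates[idx + 1:] if dist(v, w) >= d]
            best = max(best, grow(size + 1, rest))
        return best

    # Adding a fixed word to every codeword preserves all distances,
    # so we may assume that the all-zero word is a codeword.
    return grow(1, [w for w in range(1, 2**n) if dist(0, w) >= d])

print(max_code_size(4, 3), max_code_size(5, 4))
```

Both searches return 2, in agreement with $A_2(5, 4) = A_2(4, 3) = 2$.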
5.2.2 Gilbert-Varshamov bound

The Gilbert-Varshamov bound is a lower bound for $B_q(n, d)$ (i.e., for linear
codes) known since the 1950s. There is also an asymptotic version of the
Gilbert-Varshamov bound, which concerns infinite sequences of codes whose
lengths tend to infinity. However, we shall not discuss this asymptotic result
here. The interested reader may refer to Chap. 17, Theorem 30 of ref. [13].
For a long time, the asymptotic Gilbert-Varshamov bound was the best lower
bound known to be attainable by an infinite family of linear codes, so it became
a sort of benchmark for judging the 'goodness' of an infinite sequence of linear
codes. Between 1977 and 1982, V. D. Goppa constructed algebraic-geometry
codes using algebraic curves over finite fields with many rational points. A
major breakthrough in coding theory was achieved shortly after these discoveries,
when it was shown that there are sequences of algebraic-geometry codes
that perform better than the asymptotic Gilbert-Varshamov bound for certain
sufficiently large q.


Theorem 5.2.6 (Gilbert-Varshamov bound.) Let n, k and d be integers satisfying
$2 \le d \le n$ and $1 \le k \le n$. If

$$\sum_{i=0}^{d-2} \binom{n-1}{i}(q-1)^i < q^{n-k}, \qquad (5.1)$$

then there exists an [n, k]-linear code over $\mathbf{F}_q$ with minimum distance at
least d.


Proof. We shall show that, if (5.1) holds, then there exists an $(n-k) \times n$ matrix
H over $\mathbf{F}_q$ such that every d - 1 columns of H are linearly independent.

We construct H as follows. Let $\mathbf{c}_j$ denote the jth column of H.

Let $\mathbf{c}_1$ be any nonzero vector in $\mathbf{F}_q^{n-k}$. Let $\mathbf{c}_2$ be any vector not in the span
of $\mathbf{c}_1$. For any $2 \le j \le n$, let $\mathbf{c}_j$ be any vector that is not in the linear span of
d - 2 (or fewer) of the vectors $\mathbf{c}_1, \ldots, \mathbf{c}_{j-1}$.

Note that the number of vectors in the linear span of d - 2 or fewer of
$\mathbf{c}_1, \ldots, \mathbf{c}_{j-1}$ ($2 \le j \le n$) is at most

$$\sum_{i=0}^{d-2} \binom{j-1}{i}(q-1)^i \le \sum_{i=0}^{d-2} \binom{n-1}{i}(q-1)^i < q^{n-k}.$$

Hence, the vector $\mathbf{c}_j$ ($2 \le j \le n$) can always be found.

The matrix H constructed in this manner is an $(n-k) \times n$ matrix, and any
d - 1 of its columns are linearly independent. The null space of H is a linear
code over $\mathbf{F}_q$ of length n, of distance at least d, and of dimension at least k.
By turning to a k-dimensional subspace, we obtain a linear code of the desired
type. □
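
The column-by-column construction in this proof is effectively an algorithm. The sketch below implements it for the binary case only (q = 2), with vectors of $\mathbf{F}_2^{n-k}$ encoded as integers; the function names are ours, and always picking the smallest admissible vector is an arbitrary choice that the proof does not prescribe.

```python
from itertools import combinations
from math import comb

def gv_condition(q, n, k, d):
    """Inequality (5.1)."""
    return sum(comb(n - 1, i) * (q - 1)**i for i in range(d - 1)) < q**(n - k)

def xor_all(vectors):
    total = 0
    for v in vectors:
        total ^= v          # addition in F_2^(n-k) is bitwise XOR
    return total

def greedy_parity_check_columns(n, k, d):
    """Binary case: choose n columns so that every d-1 are linearly independent."""
    if not gv_condition(2, n, k, d):
        raise ValueError("inequality (5.1) does not hold")
    cols = []
    for _ in range(n):
        # All sums of at most d-2 of the columns chosen so far (0 included).
        spanned = {0}
        for size in range(1, d - 1):
            spanned.update(xor_all(s) for s in combinations(cols, size))
        # (5.1) guarantees that some vector lies outside `spanned`.
        cols.append(next(v for v in range(2**(n - k)) if v not in spanned))
    return cols

def every_d_minus_1_independent(cols, d):
    """Over F_2: no nonempty set of at most d-1 columns sums to zero."""
    return all(xor_all(s) != 0
               for size in range(1, d)
               for s in combinations(cols, size))

cols = greedy_parity_check_columns(7, 4, 3)
print(cols, every_d_minus_1_independent(cols, 3))
```

For n = 7, k = 4, d = 3 this choice yields the columns 1, 2, ..., 7, i.e., the binary representations of the nonzero vectors of $\mathbf{F}_2^3$.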


Corollary 5.2.7 For a prime power q > 1 and integers n, d such that $2 \le d \le n$,
we have

$$B_q(n, d) \ge q^{\,n - \lceil \log_q (V_q^{n-1}(d-2) + 1) \rceil} \ge \frac{q^{n-1}}{V_q^{n-1}(d-2)}.$$


Proof. Put

$$k = n - \lceil \log_q (V_q^{n-1}(d-2) + 1) \rceil.$$

Then (5.1) is satisfied and thus there exists a q-ary $[n, k, d_1]$-linear code with
$d_1 \ge d$ by Theorem 5.2.6. By changing certain $d_1 - d$ fixed coordinates to 0,
we obtain a q-ary [n, k, d]-linear code (see also Theorem 6.1.1(iv)). Our result
follows from the fact that $B_q(n, d) \ge q^k$. □


5.3 Hamming bound and perfect codes 

The first upper bound for $A_q(n, d)$ that we will discuss is the Hamming bound,
also known as the sphere-packing bound.


Theorem 5.3.1 (Hamming or sphere-packing bound.) For an integer q > 1
and integers n, d such that $1 \le d \le n$, we have

$$A_q(n, d) \le \frac{q^n}{\sum_{i=0}^{\lfloor (d-1)/2 \rfloor} \binom{n}{i}(q-1)^i}.$$

Proof. Let $C = \{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_M\}$ be an optimal (n, M, d)-code over A (with
|A| = q), so $M = A_q(n, d)$. Let $e = \lfloor (d-1)/2 \rfloor$; then the packing spheres
$S_A(\mathbf{c}_i, e)$ are disjoint. Hence, we have

$$\bigcup_{i=1}^{M} S_A(\mathbf{c}_i, e) \subseteq A^n,$$

where the union on the left hand side is a disjoint union. Since $|A^n| = q^n$ and
$|S_A(\mathbf{c}_i, e)| = V_q^n(e)$ for any i, we have

$$M \cdot V_q^n(e) \le q^n,$$

implying that

$$A_q(n, d) = M \le \frac{q^n}{V_q^n(e)} = \frac{q^n}{V_q^n(\lfloor (d-1)/2 \rfloor)}.$$

This completes the proof.

□
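
The Hamming bound is easy to evaluate, and equality in $M \cdot V_q^n(e) \le q^n$ can be tested directly. The Python sketch below is illustrative; the function names are ours. It uses the parameters $(2^r - 1, 2^{2^r - 1 - r}, 3)$ of the binary Hamming codes from Example 5.1.3(iii), together with the standard parameters $(23, 2^{12}, 7)$ of the unextended binary Golay code, which is a well known fact rather than something stated above.

```python
from math import comb

def sphere_volume(q, n, r):
    """V_q^n(r) as in Definition 5.2.2."""
    if r >= n:
        return q**n
    return sum(comb(n, i) * (q - 1)**i for i in range(r + 1))

def hamming_bound(q, n, d):
    """Largest integer M with M * V_q^n(floor((d-1)/2)) <= q^n."""
    return q**n // sphere_volume(q, n, (d - 1) // 2)

def attains_hamming_bound(q, n, M, d):
    """True if the packing spheres of an (n, M, d)-code fill A^n exactly."""
    return M * sphere_volume(q, n, (d - 1) // 2) == q**n

for r in (2, 3, 4):
    n = 2**r - 1
    print(n, hamming_bound(2, n, 3), attains_hamming_bound(2, n, 2**(n - r), 3))

print(attains_hamming_bound(2, 23, 2**12, 7))   # unextended binary Golay code
print(attains_hamming_bound(2, 8, 2**4, 4))     # an [8, 4, 4]-code does not
```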
Definition 5.3.2 A q-ary code that attains the Hamming (or sphere-packing)
bound, i.e., one which has $q^n / \sum_{i=0}^{\lfloor (d-1)/2 \rfloor} \binom{n}{i}(q-1)^i$ codewords, is called a
perfect code.

Some of the earliest known codes, such as the Hamming codes and the Golay 
codes, are perfect codes. 

5.3.1 Binary Hamming codes

Hamming codes were discovered by R. W. Hamming and M. J. E. Golay. They 
form an important class of codes - they have interesting properties and are easy 
to encode and decode. 

While Hamming codes are defined over all finite fields $\mathbf{F}_q$, we begin by
discussing specifically the binary Hamming codes. These codes form a special
case of the general q-ary Hamming codes, but because they can be described
more simply than the general q-ary Hamming codes, and because they are
arguably the most interesting Hamming codes, it is worthwhile discussing them
separately from the other Hamming codes.

Definition 5.3.3 Let $r \ge 2$. A binary linear code of length $n = 2^r - 1$, with
parity-check matrix H whose columns consist of all the nonzero vectors of $\mathbf{F}_2^r$,
is called a binary Hamming code of length $2^r - 1$. It is denoted by Ham(r, 2).

Remark 5.3.4 (i) The order of the columns of H has not been fixed in
Definition 5.3.3. Hence, for each $r \ge 2$, the binary Hamming code Ham(r, 2)
is only well defined up to equivalence of codes.

(ii) Note that the rows of H are linearly independent since H contains all
the r columns of weight 1. Hence, H is indeed a parity-check matrix.

Example 5.3.5 Ham(3, 2): A Hamming code of length 7 with a parity-check
matrix

$$H = \begin{pmatrix} 0&0&0&1&1&1&1 \\ 0&1&1&0&0&1&1 \\ 1&0&1&0&1&0&1 \end{pmatrix}.$$

Proposition 5.3.6 (Properties of the binary Hamming codes.) 

(i) All the binary Hamming codes of a given length are equivalent. 

(ii) The dimension of Ham(r, 2) is $k = 2^r - 1 - r$.

(iii) The distance of Ham(r, 2) is d = 3, hence Ham(r, 2) is exactly
single-error-correcting.

(iv) Binary Hamming codes are perfect codes. 





Proof. (i) For a given length, any parity-check matrix can be obtained from
another by a permutation of the columns, hence the corresponding binary
Hamming codes are equivalent.

(ii) Since H, a parity-check matrix for Ham(r, 2), is an $r \times (2^r - 1)$ matrix,
the dimension of Ham(r, 2) is $2^r - 1 - r$.

(iii) Since no two columns of H are equal, any two columns of H are
linearly independent. On the other hand, H contains the columns $(100 \ldots 0)^T$,
$(010 \ldots 0)^T$ and $(110 \ldots 0)^T$, which form a linearly dependent set. Hence, by
Corollary 4.5.7, the distance of Ham(r, 2) is equal to 3. It then follows from
Theorem 2.5.10 that Ham(r, 2) is single-error-correcting.

(iv) It can be verified easily that Ham(r, 2) satisfies the Hamming bound and
is hence a perfect code. □

Decoding with a binary Hamming code

Since Ham(r, 2) is a perfect single-error-correcting code, the coset leaders
are precisely the $2^r$ (= n + 1) vectors of length n of weight $\le 1$. Let
$\mathbf{e}_j$ denote the vector with 1 in the jth coordinate and 0 elsewhere.
Then the syndrome of $\mathbf{e}_j$ is just $\mathbf{e}_j H^T$, i.e., the transpose of the jth
column of H.

Hence, if the columns of H are arranged in the order of increasing
binary numbers (i.e., the jth column of H is just the binary
representation of j; see Exercise 4.43), the decoding is given by:

Step 1: When w is received, calculate its syndrome $S(\mathbf{w}) = \mathbf{w}H^T$.

Step 2: If $S(\mathbf{w}) = \mathbf{0}$, assume w was the codeword sent.

Step 3: If $S(\mathbf{w}) \ne \mathbf{0}$, then S(w) is the binary representation of j, for
some $1 \le j \le 2^r - 1$. Assuming a single error, the word
$\mathbf{e}_j$ gives the error, so we take the sent word to be $\mathbf{w} - \mathbf{e}_j$ (or,
equivalently, $\mathbf{w} + \mathbf{e}_j$).

Example 5.3.7 We construct a syndrome look-up table for the Hamming code
given in Example 5.3.5, and use it to decode w = 1001001 (see Table 5.1).

The syndrome is $\mathbf{w}H^T = 010$, which gives the coset leader $\mathbf{e}_2 = 0100000$.
We can then decode w as $\mathbf{w} - \mathbf{e}_2 = \mathbf{w} + \mathbf{e}_2 = 1101001$.
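
The three decoding steps translate directly into code. The following minimal Python sketch is specific to the parity-check matrix of Example 5.3.5, whose jth column is the binary representation of j; the function names are ours.

```python
# Parity-check matrix of Example 5.3.5.
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]

def syndrome(w):
    """S(w) = w H^T over F_2, returned as a list of bits."""
    return [sum(h * x for h, x in zip(row, w)) % 2 for row in H]

def decode(word):
    """Correct at most one error in a 7-bit string (Steps 1-3)."""
    w = [int(b) for b in word]
    j = int("".join(map(str, syndrome(w))), 2)   # syndrome read as a binary number
    if j != 0:
        w[j - 1] ^= 1                            # subtract e_j: flip the jth bit
    return "".join(map(str, w))

print(syndrome([1, 0, 0, 1, 0, 0, 1]))
print(decode("1001001"))
```

For the received word 1001001 this reproduces the syndrome 010 and the decoded word 1101001 of Example 5.3.7. No look-up table is needed because the syndrome itself names the error position.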


Definition 5.3.8 The dual of the binary Hamming code Ham(r, 2) is called a
binary simplex code. It is sometimes denoted by S(r, 2).

Some of the properties of the simplex codes are contained in Exercise 5.19.

Table 5.1.

Coset leader u     Syndrome S(u)
0000000            000
1000000            001
0100000            010
0010000            011
0001000            100
0000100            101
0000010            110
0000001            111


Definition 5.3.9 The extended binary Hamming code, denoted $\overline{\mathrm{Ham}(r, 2)}$, is
the code obtained from Ham(r, 2) by adding a parity-check coordinate.

Proposition 5.3.10 (Properties of the extended binary Hamming codes.)

(i) $\overline{\mathrm{Ham}(r, 2)}$ is a binary $[2^r, 2^r - 1 - r, 4]$-linear code.

(ii) A parity-check matrix $\overline{H}$ for $\overline{\mathrm{Ham}(r, 2)}$ is

$$\overline{H} = \begin{pmatrix} & & & 0 \\ & H & & \vdots \\ & & & 0 \\ 1 & \cdots & 1 & 1 \end{pmatrix},$$

where H is a parity-check matrix for Ham(r, 2).

Proposition 5.3.10 follows immediately from Theorem 5.1.9 and the proof 
of Theorem 5.1.11. 

Remark 5.3.11 The rate of transmission for $\overline{\mathrm{Ham}(r, 2)}$ is slower than that of
Ham(r, 2), but the extended code is better suited for incomplete decoding.

Example 5.3.12 Let r = 3 and take

$$\overline{H} = \begin{pmatrix} 0&0&0&1&1&1&1&0 \\ 0&1&1&0&0&1&1&0 \\ 1&0&1&0&1&0&1&0 \\ 1&1&1&1&1&1&1&1 \end{pmatrix}.$$

Note that every codeword is made up of 8 bits and recall that the syndrome of the
error vector $\mathbf{e}_j$ is just the transpose of the jth column of $\overline{H}$. Assuming that as
few errors as possible have occurred, the incomplete decoding works as follows.
Suppose the received vector is w, so its syndrome is $S(\mathbf{w}) = \mathbf{w}\overline{H}^T$. Suppose
it is $S(\mathbf{w}) = (s_1, s_2, s_3, s_4)$. Then S(w) must fall into one of the following four
categories:

(i) $s_4 = 0$ and $(s_1, s_2, s_3) = \mathbf{0}$. In this case, $S(\mathbf{w}) = \mathbf{0}$, so $\mathbf{w} \in \overline{\mathrm{Ham}(3, 2)}$. We
may therefore assume that there are no errors.

(ii) $s_4 = 0$ and $(s_1, s_2, s_3) \ne \mathbf{0}$. Since $S(\mathbf{w}) \ne \mathbf{0}$, at least one error must have
occurred. If exactly one error occurs and it occurs in the jth bit, then the
error vector is $\mathbf{e}_j$, so $S(\mathbf{w}) = S(\mathbf{e}_j)$, which is the transpose of the jth column
of $\overline{H}$. An inspection of $\overline{H}$ shows immediately that the last coordinate (the
one corresponding to $s_4$) of every column is 1, contradicting the fact that
$s_4 = 0$. Hence, the assumption that exactly one error has occurred is
flawed, and we may assume at least two errors have occurred and seek
retransmission.

(iii) $s_4 = 1$ and $(s_1, s_2, s_3) = \mathbf{0}$. Again, since $S(\mathbf{w}) \ne \mathbf{0}$, at least one error has
occurred. It is easy to see that $S(\mathbf{w}) = S(\mathbf{e}_8)$, so we may assume a single
error in the last coordinate, i.e., the parity-check coordinate.

(iv) $s_4 = 1$ and $(s_1, s_2, s_3) \ne \mathbf{0}$. As before, it is easy to check that S(w) must
coincide with the transpose of one of the first seven columns of $\overline{H}$, say the
jth column. Hence, $S(\mathbf{w}) = S(\mathbf{e}_j)$, and we may assume a single error in
the jth coordinate. In fact, given the way the columns of $\overline{H}$ are arranged,
j is the number whose binary representation is $(s_1, s_2, s_3)$.
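
The four cases of Example 5.3.12 amount to a short decision procedure. The sketch below is an illustration in Python, tied to the particular matrix $\overline{H}$ of the example; the function name and the convention of returning `None` to signal a retransmission request are ours.

```python
H_BAR = [[0, 0, 0, 1, 1, 1, 1, 0],
         [0, 1, 1, 0, 0, 1, 1, 0],
         [1, 0, 1, 0, 1, 0, 1, 0],
         [1, 1, 1, 1, 1, 1, 1, 1]]

def decode_incomplete(word):
    """Return the decoded 8-bit string, or None to request retransmission."""
    w = [int(b) for b in word]
    s1, s2, s3, s4 = (sum(h * x for h, x in zip(row, w)) % 2 for row in H_BAR)
    j = 4 * s1 + 2 * s2 + s3          # (s1, s2, s3) read as a binary number
    if s4 == 0:
        # Case (i): no error detected.  Case (ii): at least two errors.
        return word if j == 0 else None
    # Case (iii): error in the parity-check coordinate.  Case (iv): error in bit j.
    position = 8 if j == 0 else j
    w[position - 1] ^= 1
    return "".join(map(str, w))

print(decode_incomplete("11010010"))   # a codeword
print(decode_incomplete("10010010"))   # single error in bit 2
print(decode_incomplete("11010011"))   # single error in the parity bit
print(decode_incomplete("00010010"))   # two errors: detected, not corrected
```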

5.3.2 q-ary Hamming codes

Let $q \ge 2$ be any prime power. Note that any nonzero vector $\mathbf{v} \in \mathbf{F}_q^r$ generates
a subspace $\langle \mathbf{v} \rangle$ of dimension 1. Furthermore, for $\mathbf{v}, \mathbf{w} \in \mathbf{F}_q^r \setminus \{\mathbf{0}\}$,
$\langle \mathbf{v} \rangle = \langle \mathbf{w} \rangle$ if and only if there is a nonzero scalar $\lambda \in \mathbf{F}_q \setminus \{0\}$ such that $\mathbf{v} = \lambda\mathbf{w}$.
Therefore, there are exactly $(q^r - 1)/(q - 1)$ distinct subspaces of dimension
1 in $\mathbf{F}_q^r$.


Definition 5.3.13 Let $r \ge 2$. A q-ary linear code, whose parity-check matrix
H has the property that the columns of H are made up of precisely one
nonzero vector from each vector subspace of dimension 1 of $\mathbf{F}_q^r$, is called a
q-ary Hamming code, often denoted as Ham(r, q).

It is an easy exercise to show that, when q = 2, the code defined here is the
same as the binary Hamming code defined earlier.

Remark 5.3.14 An easy way to write down a parity-check matrix for Ham(r, q)
is to list as columns all the nonzero r-tuples in $\mathbf{F}_q^r$ whose first nonzero entry
is 1.

Proposition 5.3.15 (Properties of the q-ary Hamming codes.)

(i) Ham(r, q) is a $[(q^r - 1)/(q - 1),\ (q^r - 1)/(q - 1) - r,\ 3]$-code.

(ii) Ham(r, q) is a perfect exactly single-error-correcting code.

The proof of Proposition 5.3.15 resembles that of Proposition 5.3.6, so we
leave it as an exercise to the reader (Exercise 5.17).

Decoding with a q-ary Hamming code

Since Ham(r, q) is a perfect single-error-correcting code, the coset
leaders, other than 0, are exactly the vectors of weight 1. A typical
coset leader is then denoted by $\mathbf{e}_{j,b}$ ($1 \le j \le n$, $b \in \mathbf{F}_q \setminus \{0\}$) - the
vector whose jth coordinate is b and the other coordinates are 0. Note
that

$$S(\mathbf{e}_{j,b}) = b\,\mathbf{c}_j^T,$$

where $\mathbf{c}_j$ denotes the jth column of H.

Decoding works as follows:

Step 1: Given a received word w, calculate $S(\mathbf{w}) = \mathbf{w}H^T$.

Step 2: If $S(\mathbf{w}) = \mathbf{0}$, then assume no errors.

Step 3: If $S(\mathbf{w}) \ne \mathbf{0}$, then find the unique $\mathbf{e}_{j,b}$ such that $S(\mathbf{w}) = S(\mathbf{e}_{j,b})$.
The sent word is then taken to be $\mathbf{w} - \mathbf{e}_{j,b}$.
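
Combining Remark 5.3.14 with the decoding steps gives a compact decoder. The Python sketch below makes a simplifying assumption that is not in the text: q is taken to be prime, so that $\mathbf{F}_q$ can be modelled by the integers modulo q (a general prime power would need genuine finite-field arithmetic). The function names are ours, and the modular inverse `pow(b, -1, q)` requires Python 3.8 or later.

```python
from itertools import product

def hamming_columns(r, q):
    """Columns of a parity-check matrix for Ham(r, q), q prime:
    all nonzero r-tuples whose first nonzero entry is 1 (Remark 5.3.14)."""
    cols = []
    for v in product(range(q), repeat=r):
        first_nonzero = next((x for x in v if x != 0), None)
        if first_nonzero == 1:
            cols.append(v)
    return cols

def decode(w, r, q):
    """Correct at most one error in w, a tuple of length n over Z_q."""
    cols = hamming_columns(r, q)
    # Step 1: S(w) = w H^T, whose ith entry is sum_j w_j * (ith entry of c_j).
    s = tuple(sum(wj * c[i] for wj, c in zip(w, cols)) % q for i in range(r))
    # Step 2: zero syndrome, assume no errors.
    if not any(s):
        return tuple(w)
    # Step 3: S(w) = b * c_j, and c_j has first nonzero entry 1,
    # so b is the first nonzero entry of S(w).
    b = next(x for x in s if x != 0)
    b_inv = pow(b, -1, q)
    j = cols.index(tuple(b_inv * x % q for x in s))
    corrected = list(w)
    corrected[j] = (corrected[j] - b) % q      # subtract e_{j,b}
    return tuple(corrected)

print(hamming_columns(2, 3))
print(decode((1, 1, 2, 1), 2, 3))
```

With r = 2 and q = 3 the columns are (0,1), (1,0), (1,1), (1,2), and the word (1, 1, 2, 1) is decoded to the codeword (1, 1, 2, 0).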

Definition 5.3.16 The dual of the q-ary Hamming code Ham(r, q) is called a
q-ary simplex code. It is sometimes denoted by S(r, q).

The reader may refer to Exercise 5.19 for some of the properties of the q-ary
simplex codes.


5.3.3 Golay codes 

The Golay codes were discovered by M. J. E. Golay in the late 1940s. The
(unextended) Golay codes are examples of perfect codes. It turns out that the
Golay codes are essentially unique in the sense that binary or ternary codes
with the same parameters as them can be shown to be equivalent to them.

Binary Golay codes

Definition 5.3.17 Let G be the 12 x 24 matrix
$G = (I_{12} \mid A)$,

where $I_{12}$ is the 12 x 12 identity matrix and A is the 12 x 12 matrix

The binary linear code with generator matrix G is called the extended binary
Golay code and will be denoted by $G_{24}$.

Remark 5.3.18 (i) The Voyager 1 and 2 spacecraft were launched towards 
Jupiter and Saturn in 1977. This code was used in the encoding and decoding 
of the general science and engineering (GSE) data for the missions. 

(ii) It is also common to call any code that is equivalent to the linear code 
with generator matrix G an extended binary Golay code. 

Proposition 5.3.19 (Properties of the extended binary Golay code.)

(i) The length of $G_{24}$ is 24 and its dimension is 12.

(ii) A parity-check matrix for $G_{24}$ is the $12 \times 24$ matrix

\[ H = (A \mid I_{12}). \]

(iii) The code $G_{24}$ is self-dual, i.e., $G_{24}^{\perp} = G_{24}$.

(iv) Another parity-check matrix for $G_{24}$ is the $12 \times 24$ matrix

\[ H' = (I_{12} \mid A) \; (= G). \]

(v) Another generator matrix for $G_{24}$ is the $12 \times 24$ matrix

\[ G' = (A \mid I_{12}) \; (= H). \]





(vi) The weight of every codeword in $G_{24}$ is a multiple of 4.

(vii) The code $G_{24}$ has no codeword of weight 4, so the distance of $G_{24}$ is
d = 8.

(viii) The code $G_{24}$ is an exactly three-error-correcting code.


Proof. (i) This is clear from the definition.

(ii) This follows from Theorem 4.5.9.

(iii) Note that the rows of G are orthogonal; i.e., if $r_i$ and $r_j$ are any two
rows of G, then $r_i \cdot r_j = 0$. This implies that $G_{24} \subseteq G_{24}^{\perp}$.
On the other hand, since both $G_{24}$ and $G_{24}^{\perp}$ have dimension 12, we must
have $G_{24} = G_{24}^{\perp}$.

(iv) A parity-check matrix of $G_{24}$ is a generator matrix of
$G_{24}^{\perp} = G_{24}$, and G is one such matrix.

(v) A generator matrix of $G_{24}$ is a parity-check matrix of
$G_{24}^{\perp} = G_{24}$, and H is one such matrix.

(vi) Let v be a codeword in $G_{24}$. We want to show that wt(v) is a multiple
of 4. Note that v is a linear combination of the rows of G. Let $r_i$ denote the ith
row of G.

First, suppose v is one of the rows of G. Since the rows of G have weight 8
or 12, the weight of v is a multiple of 4.

Next, let v be the sum $v = r_i + r_j$ of two different rows of G. Since $G_{24}$ is
self-dual, Exercise 4.22(d) shows that the weight of v is divisible by 4.

We then continue by induction on the number of rows in the sum to finish
the proof.

(vii) Note that the last row of G is a codeword of weight 8. This fact, together
with statement (vi) of this proposition, implies that d = 4 or 8.

Suppose $G_{24}$ contains a nonzero codeword v with wt(v) = 4. Write v as
$(v_1, v_2)$, where $v_1$ is the vector (of length 12) made up of the first 12 coordinates
of v, and $v_2$ is the vector (also of length 12) made up of the last 12 coordinates
of v. Then one of the following situations must occur:

Case (1) $\mathrm{wt}(v_1) = 0$ and $\mathrm{wt}(v_2) = 4$. This cannot possibly happen since,
by looking at the generator matrix G, the only such word is 0, which is of
weight 0.

Case (2) $\mathrm{wt}(v_1) = 1$ and $\mathrm{wt}(v_2) = 3$. In this case, again by looking at G,
v must be one of the rows of G, which is again a contradiction (the rows of G
have weight 8 or 12).

Case (3) $\mathrm{wt}(v_1) = 2$ and $\mathrm{wt}(v_2) = 2$. Then v is the sum of two of the rows
of G. It is easy to check that none of these sums gives $\mathrm{wt}(v_2) = 2$.

Case (4) $\mathrm{wt}(v_1) = 3$ and $\mathrm{wt}(v_2) = 1$. Since G' is a generator matrix, v must
be one of the rows of G', which clearly gives a contradiction.

Case (5) $\mathrm{wt}(v_1) = 4$ and $\mathrm{wt}(v_2) = 0$. This case is similar to case (1), using
G' instead of G.

Since we obtain contradictions in all these cases, d = 4 is impossible. Thus,
d = 8.

(viii) This follows from statement (vii) above and Theorem 2.5.10. □
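Statements (i), (iii), (vi) and (vii) can also be checked by brute force, since the code has only $2^{12}$ codewords. The sketch below is not taken from the text: the matrix A is not reproduced here, so the code assumes a commonly used choice, namely the bordered circulant built from the quadratic residues modulo 11. The helper names are hypothetical.

```python
def golay_A():
    """A 12 x 12 matrix A (assumed choice): a border of ones around the
    11 x 11 circulant whose first row marks 0 and the squares mod 11."""
    S = {0, 1, 3, 4, 5, 9}
    A = [[0] + [1] * 11]
    for i in range(11):
        A.append([1] + [1 if (i + j) % 11 in S else 0 for j in range(11)])
    return A

def span(rows):
    """All binary linear combinations of the given rows, as tuples."""
    words = [tuple([0] * len(rows[0]))]
    for r in rows:
        words += [tuple(a ^ b for a, b in zip(w, r)) for w in words]
    return words

A = golay_A()
G = [[1 if i == j else 0 for j in range(12)] + A[i] for i in range(12)]
G24 = span(G)

# Pairwise orthogonality of the rows of G, which gives self-duality (iii).
rows_orthogonal = all(sum(a & b for a, b in zip(r, s)) % 2 == 0
                      for r in G for s in G)
# The set of weights that occur; (vi) and (vii) can be read off from it.
weights = sorted({sum(w) for w in G24})
```

With this choice of A the weights that occur are 0, 8, 12, 16 and 24, in line with statements (vi) and (vii).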

Definition 5.3.20 Let $\hat{G}$ be the $12 \times 23$ matrix

\[ \hat{G} = (I_{12} \mid \hat{A}), \]

where $I_{12}$ is the $12 \times 12$ identity matrix and $\hat{A}$ is the $12 \times 11$ matrix obtained
from the matrix A by deleting the last column of A. The binary linear code
with generator matrix $\hat{G}$ is called the binary Golay code and will be denoted
by $G_{23}$.

Remark 5.3.21 Alternatively, the binary Golay code can be defined as the code
obtained from $G_{24}$ by deleting the last coordinate of every codeword.

Proposition 5.3.22 (Properties of the binary Golay code.)

(i) The length of $G_{23}$ is 23 and its dimension is 12.

(ii) A parity-check matrix for $G_{23}$ is the $11 \times 23$ matrix

\[ \hat{H} = (\hat{A}^{T} \mid I_{11}). \]

(iii) The extended code of $G_{23}$ is $G_{24}$.

(iv) The distance of $G_{23}$ is d = 7.

(v) The code $G_{23}$ is a perfect exactly three-error-correcting code.

The proof is left as an exercise to the reader (see Exercise 5.24). 

Ternary Golay codes 


Definition 5.3.23 The extended ternary Golay code, denoted by $G_{12}$, is the
ternary linear code with generator matrix $G = (I_6 \mid B)$, where B is the $6 \times 6$
matrix

Remark 5.3.24 Any linear code that is equivalent to the above code is also
called an extended ternary Golay code.

By mimicking the method used in Proposition 5.3.19, it is possible to check
that $G_{12}$ is a self-dual ternary [12, 6, 6]-code (see Exercise 5.28).

Definition 5.3.25 The ternary Golay code $G_{11}$ is the code obtained by
puncturing $G_{12}$ in the last coordinate.

One can verify that $G_{11}$ satisfies the Hamming bound with equality and is
hence a perfect ternary [11, 6, 5]-code (see Exercise 5.29).
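The perfectness of the two Golay codes comes down to an arithmetic identity: the spheres of radius $\lfloor (d-1)/2 \rfloor$ around the codewords must fill the whole space exactly. A minimal sketch of that check, using only the parameters quoted above (the function names are hypothetical):

```python
from math import comb

def sphere_volume(q, n, e):
    """Number of words within Hamming distance e of a fixed word of length n."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(e + 1))

def is_perfect(q, n, M, d):
    """True when M spheres of radius floor((d - 1) / 2) exactly fill the space."""
    e = (d - 1) // 2
    return M * sphere_volume(q, n, e) == q ** n

binary_golay = is_perfect(2, 23, 2 ** 12, 7)    # 2^12 * 2048 = 2^23
ternary_golay = is_perfect(3, 11, 3 ** 6, 5)    # 3^6 * 243 = 3^11
extended_golay = is_perfect(2, 24, 2 ** 12, 8)  # the extended code is not perfect
```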

5.3.4 Some remarks on perfect codes 

The following codes are obviously perfect codes and are called trivial perfect
codes:

(i) the linear code $C = \mathbf{F}_q^n$ (d = 1);

(ii) any C with |C| = 1 ($d = \infty$);

(iii) binary repetition codes of odd lengths consisting of two codewords at
distance n from each other (d = n).

In the earlier subsections, we have seen that the Hamming codes and the
Golay codes are examples of nontrivial perfect codes. Various constructions of
nonlinear perfect codes with the same parameters as the q-ary Hamming codes
have also been found.

In fact, the following result is true.

Theorem 5.3.26 (Van Lint and Tietäväinen.) When $q \ge 2$ is a prime power, a
nontrivial perfect code over $\mathbf{F}_q$ must have the same parameters as one of the
Hamming or Golay codes.

This result was obtained by Tietäväinen [22, 23] with considerable
contribution from van Lint [12]. A proof may be found in Chap. 6 of ref. [13]. This
result was also independently proved by Zinov'ev and Leont'ev [25].


5.4 Singleton bound and MDS codes 

In this section, we discuss an upper bound for $A_q(n, d)$ due to Singleton [20].

Theorem 5.4.1 (Singleton bound.) For any integer q > 1, any positive integer
n and any integer d such that $1 \le d \le n$, we have

\[ A_q(n, d) \le q^{n-d+1}. \]

In particular, when q is a prime power, the parameters [n, k, d] of any linear
code over $\mathbf{F}_q$ satisfy

\[ k + d \le n + 1. \]

Proof. We first note that the final statement of Theorem 5.4.1 follows from the
previous one since, by definition of $A_q(n, d)$, $q^k \le A_q(n, d)$.

To prove that $A_q(n, d) \le q^{n-d+1}$, consider an (n, M, d)-code C over an
alphabet A of size q, where $M = A_q(n, d)$. Delete the last d - 1 coordinates
from all the codewords of C. Since the distance of C is d, after deleting the
last d - 1 coordinates from all the codewords, the remaining words (of length
n - d + 1) are still all distinct. The maximum number of words of length
n - d + 1 is $q^{n-d+1}$, so $A_q(n, d) = M \le q^{n-d+1}$. □
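The deletion argument in the proof can be watched on a small example. The sketch below uses an illustrative ternary code chosen for this purpose (it is not an example from the text): the words $(a, b, a+b, a+2b)$ over $\mathbf{F}_3$. Deleting the last d - 1 coordinates leaves all codewords distinct, so their number is at most $q^{n-d+1}$.

```python
from itertools import combinations, product

def distance(u, v):
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(u, v))

def min_distance(code):
    return min(distance(u, v) for u, v in combinations(code, 2))

# Illustrative code (an assumption): {(a, b, a + b, a + 2b)} over F_3.
q, n = 3, 4
code = [(a, b, (a + b) % q, (a + 2 * b) % q)
        for a, b in product(range(q), repeat=2)]

d = min_distance(code)
# Delete the last d - 1 coordinates, as in the proof of Theorem 5.4.1.
truncated = {c[: n - (d - 1)] for c in code}
singleton = q ** (n - d + 1)
```

Here the truncated words are still pairwise distinct and the code has exactly $q^{n-d+1}$ words, so this particular code attains the Singleton bound.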

Remark 5.4.2 The following is another easy direct proof for the inequality
$k + d \le n + 1$ in the case of an [n, k, d]-linear code C:

Given any parity-check matrix H for C, the row rank, and hence the rank,
of H is, by definition, n - k. Therefore, any n - k + 1 columns of H form a
linearly dependent set. By Theorem 4.5.6(ii), $d \le n - k + 1$.

Definition 5.4.3 A linear code with parameters [n, k, d] such that k + d = n + 1
is called a maximum distance separable (MDS) code.

Remark 5.4.4 An alternative way to state the Singleton bound is: for any q-ary
code C, we have

\[ \mathcal{R}(C) + \delta(C) \le 1. \]

(In this situation, we see that our choice of the definition of the relative minimum
distance $\delta(C)$ gives a neater inequality than if $\delta(C)$ is defined to be d/n.) A
linear code C is MDS if and only if $\mathcal{R}(C) + \delta(C) = 1$.

One of the interesting properties of MDS codes is the following. 

Theorem 5.4.5 Let C be a linear code over $\mathbf{F}_q$ with parameters [n, k, d]. Let
G, H be a generator matrix and a parity-check matrix, respectively, for C.
Then, the following statements are equivalent:

(i) C is an MDS code;

(ii) every set of n - k columns of H is linearly independent;

(iii) every set of k columns of G is linearly independent;

(iv) $C^{\perp}$ is an MDS code.

Proof. The equivalence of (i) and (ii) follows directly from Corollary 4.5.7,
with d = n - k + 1.

Since G is a parity-check matrix for $C^{\perp}$, (iii) and (iv) are also equivalent by
Corollary 4.5.7.

Next, we prove that (i) implies (iv).

Recall that H is a generator matrix for $C^{\perp}$, so the length of $C^{\perp}$ is n and
the dimension is n - k. To show that $C^{\perp}$ is MDS, we need to show that the
minimum distance d' is k + 1.

Suppose $d' \le k$. Then there is a nonzero word $c \in C^{\perp}$ with at most k nonzero
entries (and hence at least n - k zero coordinates). Permuting the coordinates
does not change the weight of the words, so we may assume that the last n - k
coordinates of c are 0.

Write H as $H = (A \mid H')$, where A is some $(n - k) \times k$ matrix and H'
is a square $(n - k) \times (n - k)$ matrix. Since the columns of H' are linearly
independent (for (i) and (ii) are equivalent), H' is invertible. Hence, the rows
of H' are linearly independent. The only way to obtain 0 in all the last n - k
coordinates (such as for c) is to use the 0-linear combination of the rows of H'
(by linear independence). Therefore, the entire word c is the all-zero word 0,
which is a contradiction. Consequently, $d' \ge k + 1$. Together with the Singleton
bound, it now follows that d' = k + 1.

Since $(C^{\perp})^{\perp} = C$, the above also shows that (iv) implies (i). This completes
the proof of the theorem. □
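Criterion (iii) gives a direct, if exponential, test for the MDS property. The following sketch assumes a prime field $\mathbf{F}_p$ and a generator matrix given as a list of rows; the function names and the two example matrices are illustrative choices, not taken from the text.

```python
from itertools import combinations

def rank_mod_p(rows, p):
    """Rank of a matrix over F_p (p prime) by Gaussian elimination."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [x * inv % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(x - f * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def is_mds(G, p):
    """Theorem 5.4.5(iii): every set of k columns of G is linearly independent."""
    k, n = len(G), len(G[0])
    return all(rank_mod_p([[row[j] for j in cols] for row in G], p) == k
               for cols in combinations(range(n), k))

# A [5, 3] code over F_5 with rows (x^0), (x^1), (x^2) evaluated at x = 0..4.
G_vandermonde = [[1, 1, 1, 1, 1], [0, 1, 2, 3, 4], [0, 1, 4, 4, 1]]
# A binary [4, 2] code in which two columns coincide.
G_binary = [[1, 0, 1, 1], [0, 1, 0, 1]]
```

The first matrix passes the test because any three of its columns form a Vandermonde matrix with distinct nodes; the second fails because columns 1 and 3 are equal.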


Definition 5.4.6 An MDS code C over $\mathbf{F}_q$ is trivial if and only if C satisfies
one of the following:

(i) $C = \mathbf{F}_q^n$;

(ii) C is equivalent to the code generated by 1 = (1, . . . , 1); or 

(iii) C is equivalent to the dual of the code generated by 1. 

Otherwise, C is said to be nontrivial. 


Remark 5.4.7 When q = 2, the only MDS codes are the trivial ones. This 
fact follows easily by considering the generator matrix in standard form (see 
Exercise 5.32). 

An interesting family of examples of MDS codes is given by the (generalized)
Reed-Solomon codes. For more details, see Chapters 8 and 9. Some other
examples may also be found in the exercises at the end of this chapter.

5.5 Plotkin bound

The next upper bound for A q (n, d) that we will discuss is the Plotkin bound, 
which holds for codes for which d is large relative to n. It often gives a tighter 
upper bound than many of the other upper bounds, though it is only applicable to 
a comparatively smaller range of values of d. The proof we give for the Plotkin 
bound makes use of the following well known Cauchy-Schwarz inequality. 


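Lemma 5.5.1 (Cauchy-Schwarz inequality.) Let $\{a_1, \ldots, a_m\}$ and
$\{b_1, \ldots, b_m\}$ be any two sets of real numbers. Then

\[ \left( \sum_{r=1}^{m} a_r b_r \right)^2 = \left( \sum_{r=1}^{m} a_r^2 \right) \left( \sum_{s=1}^{m} b_s^2 \right) - \sum_{r=1}^{m} \sum_{s=1}^{m} (a_r b_s - a_s b_r)^2 / 2. \]

Consequently,

\[ \left( \sum_{r=1}^{m} a_r b_r \right)^2 \le \left( \sum_{r=1}^{m} a_r^2 \right) \left( \sum_{r=1}^{m} b_r^2 \right). \]

For more details on the Cauchy-Schwarz inequality, see, for example, ref. [9].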


Theorem 5.5.2 (Plotkin bound.) Let q > 1 be an integer and suppose that
n, d satisfy rn < d, where $r = 1 - q^{-1}$. Then,

\[ A_q(n, d) \le \left\lfloor \frac{d}{d - rn} \right\rfloor. \]

Proof. Let C be an (n, M, d)-code over an alphabet A of size q. Let

\[ T = \sum_{c \in C} \sum_{c' \in C} d(c, c'). \]

Since $d \le d(c, c')$ for $c, c' \in C$ such that $c \ne c'$, it follows that

\[ M(M - 1)d \le T. \tag{5.2} \]

Now let $\mathcal{A}$ be the $M \times n$ array whose rows are made up of the M codewords
in C. For $1 \le i \le n$ and $a \in A$, let $n_{i,a}$ denote the number of entries in the ith
column of $\mathcal{A}$ that are equal to a. Hence, $\sum_{a \in A} n_{i,a} = M$ for every $1 \le i \le n$.

Consequently, writing $c = (c_1, \ldots, c_n)$ and $c' = (c'_1, \ldots, c'_n)$, we have

\[ T = \sum_{i=1}^{n} \left( \sum_{c \in C} \sum_{c' \in C} d(c_i, c'_i) \right) = \sum_{i=1}^{n} \sum_{a \in A} n_{i,a} (M - n_{i,a}) = M^2 n - \sum_{i=1}^{n} \sum_{a \in A} n_{i,a}^2. \]

Applying Lemma 5.5.1, with m = q and $a_1 = \cdots = a_q = 1$, it follows that

\[ T \le M^2 n - \sum_{i=1}^{n} q^{-1} \left( \sum_{a \in A} n_{i,a} \right)^2 = M^2 r n. \tag{5.3} \]

The Plotkin bound now follows from (5.2) and (5.3): together they give
$M(M - 1)d \le M^2 rn$, i.e., $M \le d/(d - rn)$. □
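The double counting in this proof can be checked numerically on a small code. The sketch below reuses an illustrative ternary code, the words $(a, b, a+b, a+2b)$ over $\mathbf{F}_3$, which is an assumption made for the example and does not come from the text. It computes T both as a sum over pairs of codewords and column by column, and compares it with the two sides of (5.2) and (5.3).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def distance(u, v):
    return sum(a != b for a, b in zip(u, v))

def pair_sum(code):
    """T: the sum of d(c, c') over all ordered pairs of codewords."""
    return sum(distance(c, c2) for c in code for c2 in code)

def column_count_sum(code):
    """Sum over columns i and symbols a of n_{i,a} * (M - n_{i,a})."""
    M, n = len(code), len(code[0])
    total = 0
    for i in range(n):
        counts = Counter(c[i] for c in code)
        total += sum(m * (M - m) for m in counts.values())
    return total

q, n, d = 3, 4, 3   # parameters of the illustrative code below
code = [(a, b, (a + b) % q, (a + 2 * b) % q)
        for a, b in product(range(q), repeat=2)]
M = len(code)

T = pair_sum(code)
lower = M * (M - 1) * d                      # left-hand side of (5.2)
upper = (1 - Fraction(1, q)) * n * M * M     # right-hand side of (5.3)
```

For this code both inequalities hold with equality, which reflects the fact that its 9 codewords attain the Plotkin bound $\lfloor 3/(3 - 8/3) \rfloor = 9$.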

In fact, when q = 2, a more refined version of the Plotkin bound is available. 


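Theorem 5.5.3 (Plotkin bound for binary codes.)

(i) When d is even,

\[ A_2(n, d) \le \begin{cases} 2 \lfloor d/(2d - n) \rfloor & \text{for } n < 2d \\ 4d & \text{for } n = 2d. \end{cases} \]

(ii) When d is odd,

\[ A_2(n, d) \le \begin{cases} 2 \lfloor (d + 1)/(2d + 1 - n) \rfloor & \text{for } n < 2d + 1 \\ 4d + 4 & \text{for } n = 2d + 1. \end{cases} \]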


We leave the proof of Theorem 5.5.3 as an exercise (Exercise 5.30). 


Example 5.5.4 To illustrate that Theorem 5.5.3 gives a more refined bound
than Theorem 5.5.2, note that Theorem 5.5.2 gives $A_2(8, 5) \le 5$, $A_2(8, 6) \le 3$,
$A_2(12, 7) \le 7$ and $A_2(11, 8) \le 3$, whereas Theorem 5.5.3 gives $A_2(8, 5) \le 4$,
$A_2(8, 6) \le 2$, $A_2(12, 7) \le 4$ and $A_2(11, 8) \le 2$.
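The figures in Example 5.5.4 are easy to reproduce. The sketch below implements both versions of the Plotkin bound as stated in Theorems 5.5.2 and 5.5.3; the function names are hypothetical, and each function returns None when the corresponding bound does not apply.

```python
from fractions import Fraction
from math import floor

def plotkin(q, n, d):
    """Theorem 5.5.2: floor(d / (d - r n)) when r n < d, with r = 1 - 1/q."""
    r = 1 - Fraction(1, q)
    if r * n >= d:
        return None
    return floor(Fraction(d) / (d - r * n))

def plotkin_binary(n, d):
    """Theorem 5.5.3: the refined bound for binary codes."""
    if d % 2 == 0:
        if n < 2 * d:
            return 2 * (d // (2 * d - n))
        if n == 2 * d:
            return 4 * d
    else:
        if n < 2 * d + 1:
            return 2 * ((d + 1) // (2 * d + 1 - n))
        if n == 2 * d + 1:
            return 4 * d + 4
    return None

pairs = [(8, 5), (8, 6), (12, 7), (11, 8)]
general = [plotkin(2, n, d) for n, d in pairs]
refined = [plotkin_binary(n, d) for n, d in pairs]
```

Exact rational arithmetic is used for r so that the floor is never affected by rounding.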

Example 5.5.5 In Tables 5.2-5.4, we list the sphere-covering lower bound and
compare the Hamming, Singleton and Plotkin upper bounds for $A_2(n, d)$, with
d = 3, 5, 7 and $d \le n \le 12$. In cases where the Plotkin bound is not applicable,
the entry is marked 


5.6 Nonlinear codes 

Whereas most of this book focuses on linear codes, there are several families of 
(binary) nonlinear codes that are well known and important in coding theory. 
We provide a brief introduction to some of them in this section.

Table 5.2. Bounds for $A_2(n, 3)$.

Table 5.3. Bounds for $A_2(n, 5)$.

Table 5.4. Bounds for $A_2(n, 7)$.

5.6.1 Hadamard matrix codes

Definition 5.6.1 A Hadamard matrix $H_n$ is an $n \times n$ integer matrix whose
entries are 1 or -1 and which satisfies $H_n H_n^{T} = n I_n$, where $I_n$ is the identity
matrix.

When such a Hadamard matrix exists, then either n = 1, 2 or n is a multiple
of 4. The existence of Hadamard matrices is known for many n; for example
when n is a power of 2 (these are called Sylvester matrices), and when
$n = p^m + 1$, where p is a prime and n is divisible by 4 (this is called the Paley
construction). The construction of Sylvester matrices is easy. We begin with
$H_1 = (1)$ and use the observation that, whenever $H_n$ is a Hadamard matrix of
order n, the matrix

\[ H_{2n} = \begin{pmatrix} H_n & H_n \\ H_n & -H_n \end{pmatrix} \]

is a Hadamard matrix of order 2n.
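The doubling step translates directly into code. The sketch below builds the Sylvester matrix of order $2^m$ starting from $H_1 = (1)$ and checks the defining relation $H H^{T} = nI$; the function names are hypothetical.

```python
def sylvester(m):
    """Sylvester-type Hadamard matrix of order 2^m as a list of rows."""
    H = [[1]]
    for _ in range(m):
        # Block form (H H; H -H), built row by row.
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def is_hadamard(H):
    """Check that the entries are +1 or -1 and that H H^T = n I."""
    n = len(H)
    entries_ok = all(x in (1, -1) for row in H for x in row)
    products_ok = all(
        sum(a * b for a, b in zip(H[i], H[j])) == (n if i == j else 0)
        for i in range(n) for j in range(n))
    return entries_ok and products_ok

H8 = sylvester(3)
```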

The existence of a Hadamard matrix $H_n$ implies the existence of binary
nonlinear codes of the following parameters:

$(n, 2 \lfloor d/(2d - n) \rfloor, d)$ for d even and $d \le n < 2d$;
$(2d, 4d, d)$ for d even;
$(n, 2 \lfloor (d + 1)/(2d + 1 - n) \rfloor, d)$ for d odd and $d \le n < 2d + 1$;
$(2d + 1, 4d + 4, d)$ for d odd.

These codes were constructed by Levenshtein [10]. By the Plotkin bound, they 
are optimal. 


5.6.2 Nordstrom-Robinson code 

It can be shown that there cannot be any binary linear codes of parameters
[16, 8, 6] (see Exercise 5.35). However, there does exist a binary nonlinear
code, called the Nordstrom-Robinson code, of parameters $(16, 2^8, 6)$. It was
discovered by Nordstrom and Robinson [16] (when Nordstrom was still a high
school student!) and later independently by Semakov and Zinov'ev [19]. One
construction of this famous code is as follows.

Rearrange the columns of the extended binary Golay code so that the new
code (also called $G_{24}$) contains the word $11111111\,0 \cdots 0$ (eight 1s followed
by sixteen 0s), and let G denote a generator matrix for this new $G_{24}$. Since
$d(G_{24}) = 8 > 7$, Theorem 4.5.6(i) shows that the first seven columns of G are
linearly independent. One can then show that each of the $2^7$ possible vectors
in $\mathbf{F}_2^7$ appears as the first seven coordinates of some codeword in $G_{24}$. In fact,
each of them appears in exactly $2^{12}/2^7 = 32$ codewords of $G_{24}$. Now collect all
those words in $G_{24}$ whose first seven coordinates are either all 0 or are made
up of six 0s and one 1. There are altogether $8 \times 32 = 256 = 2^8$ of them.

The Nordstrom-Robinson code is obtained by deleting the first eight
coordinates from these $2^8$ vectors. It can be shown that this code has minimum
distance 6 and is nonlinear.
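This construction is small enough to carry out by machine. The sketch below is illustrative only: it assumes a particular generator matrix for the extended Golay code (identity next to the bordered circulant built from the quadratic residues modulo 11, which is a common choice and not necessarily the matrix used in this chapter), and it takes the first weight-8 codeword it finds as the word to be moved to the front.

```python
from itertools import combinations

def extended_golay():
    """All 2^12 codewords of an extended binary Golay code (assumed generator)."""
    S = {0, 1, 3, 4, 5, 9}            # 0 and the squares modulo 11
    A = [[0] + [1] * 11] + [[1] + [int((i + j) % 11 in S) for j in range(11)]
                            for i in range(11)]
    rows = [[int(i == j) for j in range(12)] + A[i] for i in range(12)]
    words = [tuple([0] * 24)]
    for r in rows:
        words += [tuple(a ^ b for a, b in zip(w, r)) for w in words]
    return words

def distance(u, v):
    return sum(a != b for a, b in zip(u, v))

G24 = extended_golay()

# Rearrange the coordinates so that 1^8 0^16 becomes a codeword.
octad = next(w for w in G24 if sum(w) == 8)
order = [i for i in range(24) if octad[i]] + [i for i in range(24) if not octad[i]]
G24 = [tuple(w[i] for i in order) for w in G24]

# Keep the words whose first seven coordinates have weight 0 or 1,
# then delete the first eight coordinates.
NR = sorted({w[8:] for w in G24 if sum(w[:7]) <= 1})

min_dist = min(distance(u, v) for u, v in combinations(NR, 2))
NR_set = set(NR)
is_linear = all(tuple(a ^ b for a, b in zip(u, v)) in NR_set
                for u, v in combinations(NR, 2))
```

The resulting set has 256 words of length 16 and minimum distance 6, and it is not closed under addition, in agreement with the parameters $(16, 2^8, 6)$ stated above.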


5.6.3 Preparata codes 

For $m \ge 2$, Preparata codes are binary nonlinear codes with the parameters

\[ (2^{2m}, 2^{2^{2m} - 4m}, 6). \]

There are several different ways to construct the Preparata codes; one way
is as follows.

Write the vectors of $\mathbf{F}_2^{2^{2m}}$ in the form (u, v), where $u, v \in \mathbf{F}_2^{2^{2m-1}}$. Label the
coordinate positions of these vectors in $\mathbf{F}_2^{2^{2m-1}}$ by the elements of $\mathbf{F}_{2^{2m-1}}$, with
the first coordinate position corresponding to 0. For $\alpha \in \mathbf{F}_{2^{2m-1}}$, denote the
entry at the $\alpha$th coordinate of u, v by $u_\alpha$, $v_\alpha$, respectively.

Definition 5.6.2 For $m \ge 2$, the Preparata code P(m) of length $2^{2m}$
consists of all the codewords (u, v), where $u, v \in \mathbf{F}_2^{2^{2m-1}}$, satisfying the following
conditions:

(i) both u and v are of even Hamming weight;

(ii) $\sum_{u_\alpha = 1} \alpha = \sum_{v_\alpha = 1} \alpha$;

(iii) $\sum_{u_\alpha = 1} \alpha^3 + \left( \sum_{u_\alpha = 1} \alpha \right)^3 = \sum_{v_\alpha = 1} \alpha^3$.

It can be shown that P(m) is a subcode of the extended binary Hamming
code of the same length (see Chap. 15 of ref. [13] or Sect. 9.4 of ref. [24]).

The first code in this family, with m = 2, can be shown to be equivalent to
the Nordstrom-Robinson $(16, 2^8, 6)$-code in Section 5.6.2.

5.6.4 Kerdock codes 

For $m \ge 2$, the Kerdock codes K(m) are binary nonlinear codes with parameters

\[ (2^{2m}, 2^{4m}, 2^{2m-1} - 2^{m-1}). \]

The Kerdock code K(m) is constructed as a union of $2^{2m-1}$ cosets of the
Reed-Muller code $\mathcal{R}(1, 2m)$ in $\mathcal{R}(2, 2m)$ (see Section 6.2).

Once again, the first code in this family, with m = 2, is equivalent to the 
Nordstrom-Robinson code. The Kerdock codes form a special case of a more 
general family of nonlinear codes called the Delsarte-Goethals codes. 





The weight enumerators of the Kerdock and Preparata codes can be shown
to satisfy the MacWilliams identity (see Exercise 4.49), thus giving a 'formal
duality' between the Kerdock and Preparata codes. However, this falls beyond
the scope of this book, so we will not elaborate further on this formal duality.
The interested reader may refer to Chap. 15, Theorem 24 of ref. [13] for more
details. This mystery of the formal duality between the Kerdock and Preparata
codes was explained when it was shown by Nechaev [15] and Hammons et al.
[7] that the Kerdock codes can be viewed as linear codes over the ring $\mathbf{Z}_4$, and
by Hammons et al. [7] that the binary images of the $\mathbf{Z}_4$-dual of the Kerdock
codes over $\mathbf{Z}_4$ can be regarded as variants of the Preparata codes.


5.7 Griesmer bound 

The next bound we shall discuss is the Griesmer bound, which applies specifi- 
cally to linear codes. 

Let C be a linear code over $\mathbf{F}_q$ with parameters [n, k] and suppose c is a
codeword in C with wt(c) = w.

Definition 5.7.1 The support of c, denoted by Supp(c), is the set of coordinates
at which c is nonzero.

Definition 5.7.2 The residual code of C with respect to c, denoted Res(C, c), is
the code of length n - w obtained from C by puncturing on all the coordinates
of Supp(c).

Note that w = |Supp(c)|.

Lemma 5.7.3 If C is an [n, k, d]-code over $\mathbf{F}_q$ and $c \in C$ is a codeword of
weight d, then Res(C, c) is an [n - d, k - 1, d']-code, where $d' \ge \lceil d/q \rceil$. Here,
$\lceil x \rceil$ is the least integer greater than or equal to x.

Proof. Without loss of generality, we may replace C by an equivalent code so
that c = (1, 1, ..., 1, 0, 0, ..., 0), where the first d coordinates are 1 and the
other coordinates are 0.

We first note that Res(C, c) has dimension at most k - 1. To see this, observe
first that Res(C, c) is a linear code. For every $x \in \mathbf{F}_q^n$, denote by x' the vector
obtained from x by deleting the first d coordinates, i.e., by puncturing on the
coordinates of Supp(c). Now, it is easy to see that the map $C \to \mathrm{Res}(C, c)$
given by $x \mapsto x'$ is a well defined surjective linear transformation of vector
spaces, whose kernel contains c and is hence a subspace of C of dimension at
least 1. Therefore, Res(C, c) has dimension at most k - 1.

We shall show that Res(C, c) has dimension exactly k - 1.

Suppose that the dimension is strictly less than k - 1. Then there is a nonzero
codeword $v = (v_1, v_2, \ldots, v_n)$ in C that is not a multiple of c and that has the
property that $v_{d+1} = \cdots = v_n = 0$. Then $v - v_1 c$ is a nonzero codeword that
belongs to C and that has weight strictly less than d, contradicting the definition
of d. Hence, Res(C, c) has dimension k - 1.

To show that $d' \ge \lceil d/q \rceil$, let $(x_{d+1}, \ldots, x_n)$ be any nonzero codeword of
Res(C, c), and let $x = (x_1, \ldots, x_d, x_{d+1}, \ldots, x_n)$ be a corresponding word in
C. By the pigeonhole principle, there is an $\alpha \in \mathbf{F}_q$ such that at least d/q
coordinates of $(x_1, \ldots, x_d)$ are equal to $\alpha$. Hence,

\[ d \le \mathrm{wt}(x - \alpha c) \le d - \frac{d}{q} + \mathrm{wt}((x_{d+1}, \ldots, x_n)). \]

The inequality $d' \ge \lceil d/q \rceil$ now follows. □


Theorem 5.7.4 (Griesmer bound.) Let C be a q-ary code of parameters
[n, k, d], where $k \ge 1$. Then

\[ n \ge \sum_{i=0}^{k-1} \left\lceil \frac{d}{q^i} \right\rceil. \]

Proof. We prove the Griesmer bound by induction on k. Clearly, when k = 1,
Theorem 5.7.4 holds.

When k > 1 and $c \in C$ is a codeword of minimum weight d, then
Lemma 5.7.3 shows that Res(C, c) is an [n - d, k - 1, d']-code, where
$d' \ge \lceil d/q \rceil$. By the inductive hypothesis, we may assume that the Griesmer
bound holds for Res(C, c), hence

\[ n - d \ge \sum_{i=0}^{k-2} \left\lceil \frac{d'}{q^i} \right\rceil \ge \sum_{i=0}^{k-2} \left\lceil \frac{d}{q^{i+1}} \right\rceil. \]

Theorem 5.7.4 now follows. □
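The right-hand side of the Griesmer bound is a one-line computation. The sketch below (the function name is hypothetical) evaluates it and compares it with the lengths of a few codes whose parameters appear in this chapter: the simplex codes with parameters $[(q^r - 1)/(q - 1), r, q^{r-1}]$ and the binary Golay codes.

```python
def griesmer_length(q, k, d):
    """Sum of ceil(d / q^i) for i = 0, ..., k - 1: the least n allowed by the bound."""
    return sum(-(-d // q ** i) for i in range(k))

def simplex_parameters(r, q):
    """Parameters [n, k, d] of the q-ary simplex code S(r, q)."""
    return ((q ** r - 1) // (q - 1), r, q ** (r - 1))

# Simplex codes meet the bound with equality.
simplex_meets_bound = all(
    griesmer_length(q, *simplex_parameters(r, q)[1:]) == simplex_parameters(r, q)[0]
    for q in (2, 3, 5) for r in (2, 3, 4))

golay_23 = griesmer_length(2, 12, 7)   # compare with n = 23
golay_24 = griesmer_length(2, 12, 8)   # compare with n = 24
```

The expression `-(-d // q ** i)` is integer ceiling division, which avoids floating-point rounding.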


Example 5.7.5 From Exercise 5.19, the q-ary simplex code S(r, q) has
parameters $[(q^r - 1)/(q - 1), r, q^{r-1}]$, so it meets the Griesmer bound.

5.8 Linear programming bound

One of the best bounds in existence for $A_q(n, d)$ is one that is based on linear
programming techniques. It is due to Delsarte [2]. The bound obtained through
this method is often called the linear programming bound.

A family of polynomials, called the Krawtchouk polynomials, plays a pivotal 
role in this theory. Krawtchouk polynomials are also very useful in other areas 
of coding theory. We give the definition and summarize some properties of 
these polynomials below. 

Definition 5.8.1 For a given q, the Krawtchouk polynomial $K_k(x; n)$ is defined
to be

\[ K_k(x; n) = \sum_{j=0}^{k} (-1)^j \binom{x}{j} \binom{n - x}{k - j} (q - 1)^{k-j}. \]

When there is no ambiguity for n, the notation is often simplified to $K_k(x)$.

Proposition 5.8.2 (Properties of Krawtchouk polynomials.)

(i) If z is a variable, then $\sum_{k=0}^{\infty} K_k(x) z^k = (1 + (q - 1)z)^{n-x} (1 - z)^x$.

(ii) $K_k(x) = \sum_{j=0}^{k} (-1)^j q^{k-j} \binom{n - k + j}{j} \binom{n - x}{k - j}$.

(iii) $K_k(x)$ is a polynomial of degree k, with leading coefficient $(-q)^k / k!$ and
constant term $K_k(0) = \binom{n}{k} (q - 1)^k$.

(iv) (Orthogonality relations.) $\sum_{i=0}^{n} \binom{n}{i} (q - 1)^i K_k(i) K_\ell(i) = \delta_{k\ell} \binom{n}{k} (q - 1)^k q^n$,
where $\delta_{k\ell}$ is the Kronecker delta function; i.e.,

\[ \delta_{k\ell} = \begin{cases} 1 & \text{if } k = \ell \\ 0 & \text{otherwise.} \end{cases} \]

(v) $(q - 1)^i \binom{n}{i} K_k(i) = (q - 1)^k \binom{n}{k} K_i(k)$.

(vi) $\sum_{i=0}^{n} K_\ell(i) K_i(k) = \delta_{k\ell}\, q^n$.

(vii) $\sum_{k=0}^{j} \binom{n - k}{n - j} K_k(x) = q^j \binom{n - x}{j}$.

(viii) When q = 2, we have $K_i(x) K_j(x) = \sum_{k=0}^{n} \cdots$

(ix) Every polynomial f(x) of degree r can be expressed as
$f(x) = \sum_{k=0}^{r} f_k K_k(x)$, where $f_k = q^{-n} \sum_{i=0}^{n} f(i) K_i(k)$. (This way of
expressing f(x) is called the Krawtchouk expansion of f(x).)
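Several of these identities can be spot-checked numerically for small n and q. The sketch below takes the Krawtchouk polynomial to be the coefficient of $z^k$ in the generating function of property (i), expanded as a finite sum, and evaluates it at integer points only. The function names are hypothetical, and a numerical check of this kind is of course no substitute for the proof asked for in Exercise 5.42.

```python
from math import comb

def krawtchouk(k, x, n, q):
    """K_k(x; n) at an integer 0 <= x <= n: coefficient of z^k in
    (1 + (q - 1) z)^(n - x) * (1 - z)^x."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) * (q - 1) ** (k - j)
               for j in range(k + 1))

def check_properties(n, q):
    """Check (iii) constant term, (iv), (v), (vi) and (vii) on 0..n."""
    K = lambda k, x: krawtchouk(k, x, n, q)
    R = range(n + 1)
    constant = all(K(k, 0) == comb(n, k) * (q - 1) ** k for k in R)
    orthogonal = all(
        sum(comb(n, i) * (q - 1) ** i * K(k, i) * K(l, i) for i in R)
        == (comb(n, k) * (q - 1) ** k * q ** n if k == l else 0)
        for k in R for l in R)
    symmetric = all(
        (q - 1) ** i * comb(n, i) * K(k, i) == (q - 1) ** k * comb(n, k) * K(i, k)
        for i in R for k in R)
    inverse = all(
        sum(K(l, i) * K(i, k) for i in R) == (q ** n if k == l else 0)
        for k in R for l in R)
    partial_sums = all(
        sum(comb(n - k, n - j) * K(k, x) for k in range(j + 1))
        == q ** j * comb(n - x, j)
        for j in R for x in R)
    return constant and orthogonal and symmetric and inverse and partial_sums

binary_ok = check_properties(5, 2)
ternary_ok = check_properties(4, 3)
```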

We leave the proof of Proposition 5.8.2 to the reader (see Exercise 5.42).

The linear programming bound gives an upper bound for $A_q(n, d)$; i.e., it
applies also to nonlinear codes. Therefore, we will deal with the distance
between two distinct codewords and not the weight of each codeword. For the
main result in this section, we need the following notion.

Definition 5.8.3 Let A be an alphabet of size q. For C an (n, M)-code over A
and for all $0 \le i \le n$, let

\[ A_i(C) = \frac{1}{M} \left| \{ (u, v) \in C \times C : d(u, v) = i \} \right|. \]

The sequence $\{A_i(C)\}_{i=0}^{n}$ is called the distance distribution of C.

Remark 5.8.4 Note that the distance distribution depends only on the size
q of the code alphabet and not on the alphabet itself. To obtain the linear
programming bound, it is more convenient to work with the ring $\mathbf{Z}_q$ as the
alphabet. Hence, in the discussion below, while we begin with codes over an
alphabet A of size q, we pass immediately to codes over $\mathbf{Z}_q$ in the proofs.

Lemma 5.8.5 Let C be a q-ary code of length n. Then

\[ \sum_{i=0}^{n} A_i(C) K_k(i) \ge 0 \]

for all integers $0 \le k \le n$.

Proof. As mentioned in Remark 5.8.4, we assume C is defined over $\mathbf{Z}_q$.
It suffices to show that $M \sum_{i=0}^{n} A_i(C) K_k(i) \ge 0$, where M = |C|. Using
Exercise 5.46,

\[ M \sum_{i=0}^{n} A_i(C) K_k(i) = \sum_{i=0}^{n} \sum_{\substack{(u,v) \in C^2 \\ d(u,v) = i}} \sum_{\substack{w \in \mathbf{Z}_q^n \\ \mathrm{wt}(w) = k}} \zeta^{(u - v) \cdot w} = \sum_{\substack{w \in \mathbf{Z}_q^n \\ \mathrm{wt}(w) = k}} \left| \sum_{u \in C} \zeta^{u \cdot w} \right|^2 \ge 0, \]

where, for $u = (u_1, \ldots, u_n)$ and $w = (w_1, \ldots, w_n)$, $u \cdot w = u_1 w_1 + \cdots + u_n w_n$,
and $\zeta$ is a primitive qth root of unity in $\mathbf{C}$; i.e., $\zeta^q = 1$ but $\zeta^i \ne 1$ for all
0 < i < q. □
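For a concrete code, the distance distribution and the sums in Lemma 5.8.5 can be computed directly. The sketch below uses a small binary code that is deliberately nonlinear; the code itself and the function names are illustrative choices, and the Krawtchouk polynomial is evaluated at integer points from the finite sum given by the generating function.

```python
from fractions import Fraction
from math import comb

def krawtchouk(k, x, n, q):
    """K_k(x; n) at an integer 0 <= x <= n."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) * (q - 1) ** (k - j)
               for j in range(k + 1))

def distance_distribution(code):
    """A_i(C) = (1/M) * #{(u, v) in C x C : d(u, v) = i} for i = 0..n."""
    M, n = len(code), len(code[0])
    counts = [0] * (n + 1)
    for u in code:
        for v in code:
            counts[sum(a != b for a, b in zip(u, v))] += 1
    return [Fraction(c, M) for c in counts]

def delsarte_sums(code, q):
    """The sums over i of A_i(C) K_k(i), for k = 0..n."""
    n = len(code[0])
    A = distance_distribution(code)
    return [sum(A[i] * krawtchouk(k, i, n, q) for i in range(n + 1))
            for k in range(n + 1)]

# An arbitrary nonlinear binary code of length 4 (illustrative).
code = [(0, 0, 0, 0), (1, 1, 0, 0), (0, 1, 1, 0), (1, 1, 1, 1)]
A = distance_distribution(code)
sums = delsarte_sums(code, 2)
```

Here $A_0 = 1$, the $A_i$ add up to M = 4, and every entry of `sums` is nonnegative, as the lemma predicts; the entry for k = 0 is M itself.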

Theorem 5.8.6 (Linear programming bound - version 1.) For a given integer
q > 1 and positive integers n and d ($1 \le d \le n$), we have

\[ A_q(n, d) \le \max \left\{ \sum_{i=0}^{n} A_i : A_0 = 1,\ A_i = 0 \text{ for } 1 \le i < d,\ A_i \ge 0 \text{ for } 0 \le i \le n,\ \sum_{i=0}^{n} A_i K_k(i) \ge 0 \text{ for } 0 \le k \le n \right\}. \tag{5.4} \]

Proof. Let $M = A_q(n, d)$. If C is a q-ary (n, M)-code of distance d, its distance
distribution $\{A_i(C)\}_{i=0}^{n}$ satisfies the following conditions:

(i) $A_0(C) = 1$;





(ii) $A_i(C) = 0$ for $1 \le i < d$;

(iii) $A_i(C) \ge 0$ for all $0 \le i \le n$;

(iv) $\sum_{i=0}^{n} A_i(C) K_k(i) \ge 0$ for $0 \le k \le n$ (from Lemma 5.8.5);

(v) $M = A_q(n, d) = \sum_{i=0}^{n} A_i(C)$.

Hence, the inequality (5.4) follows immediately. □

The following theorem is the linear programming dual of Theorem 5.8.6. It is
often more useful than Theorem 5.8.6 because any polynomial f(x) that
satisfies the hypotheses of Theorem 5.8.7 gives an upper bound for $A_q(n, d)$, while an
optimal solution for the linear programming problem in (5.4) is required to give
an upper bound for $A_q(n, d)$.

Theorem 5.8.7 (Linear programming bound - version 2.) Let q > 1 be an
integer. For positive integers n and d ($1 \le d \le n$), let
$f(x) = 1 + \sum_{k=1}^{n} f_k K_k(x)$ be a polynomial such that $f_k \ge 0$ ($1 \le k \le n$) and
$f(i) \le 0$ for $d \le i \le n$. Then $A_q(n, d) \le f(0)$.

Proof. As in the proof of Theorem 5.8.6, let $M = A_q(n, d)$, let C be a q-ary
(n, M)-code of distance d and let $\{A_i(C)\}_{i=0}^{n}$ be its distance distribution.

Note that conditions (i), (ii) and (iv) in the proof of Theorem 5.8.6 imply that
$K_k(0) \ge -\sum_{i=d}^{n} A_i(C) K_k(i)$ for all $0 \le k \le n$. The condition that $f(i) \le 0$
for $d \le i \le n$ implies that $\sum_{i=d}^{n} A_i(C) f(i) \le 0$, which means that

\[ \begin{aligned} f(0) &= 1 + \sum_{k=1}^{n} f_k K_k(0) \\ &\ge 1 - \sum_{k=1}^{n} f_k \sum_{i=d}^{n} A_i(C) K_k(i) \\ &= 1 - \sum_{i=d}^{n} A_i(C) \sum_{k=1}^{n} f_k K_k(i) \\ &= 1 - \sum_{i=d}^{n} A_i(C) (f(i) - 1) \\ &\ge 1 + \sum_{i=d}^{n} A_i(C) \\ &= M = A_q(n, d). \end{aligned} \]

□

To illustrate that the linear programming bound can be better than some
other bounds that we have discussed in this chapter, we show in Example 5.8.8
how one can deduce the Singleton bound, the Hamming bound and the Plotkin
bound from the linear programming bound.


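Example 5.8.8 (i) (Singleton bound.) Let

\[ f(x) = q^{n-d+1} \prod_{j=d}^{n} \left( 1 - \frac{x}{j} \right). \]

By Proposition 5.8.2(ix), $f(x) = \sum_{k=0}^{n} f_k K_k(x)$, where $f_k$ is given by

\[ f_k = \frac{1}{q^n} \sum_{i=0}^{n} f(i) K_i(k) = \frac{1}{q^{d-1}} \sum_{i=0}^{d-1} \frac{\binom{n-i}{n-d+1}}{\binom{n}{d-1}} K_i(k) = \frac{\binom{n-k}{d-1}}{\binom{n}{d-1}} \ge 0, \]

where the last equality follows from Proposition 5.8.2(vii). In particular,
$f_0 = 1$. Clearly, $f(i) = 0$ for $d \le i \le n$.

Hence, by Theorem 5.8.7, it follows that $A_q(n, d) \le f(0) = q^{n-d+1}$, which
is the Singleton bound (cf. Theorem 5.4.1).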

(ii) (Hamming bound.) Let d = 2e + 1. Let $f(x) = \sum_{k=0}^{n} f_k K_k(x)$, where

\[ f_k = \left\{ L_e(k) \Big/ \sum_{i=0}^{e} (q - 1)^i \binom{n}{i} \right\}^2 \quad (0 \le k \le n), \]

with $L_e(x) = \sum_{i=0}^{e} K_i(x) = K_e(x - 1; n - 1)$. (The polynomial $L_e(x)$ is called
a Lloyd polynomial.) Clearly, $f_k \ge 0$ for all $0 \le k \le n$ and $f_0 = 1$. Using
Proposition 5.8.2(viii) and (vi), it can be shown that $f(i) = 0$ for $d \le i \le n$.
Therefore, Theorem 5.8.7 and Proposition 5.8.2(iv) show that

\[ A_q(n, d) \le f(0) = q^n \Big/ \sum_{i=0}^{e} \binom{n}{i} (q - 1)^i, \]

which is exactly the Hamming bound.

(iii) (Plotkin bound for $A_2(2\ell + 1, \ell + 1)$.) Set q = 2, $n = 2\ell + 1$ and
$d = \ell + 1$. Take $f_1 = (\ell + 1)/(2\ell + 1)$ and $f_2 = 1/(2\ell + 1)$, so that

\[ \begin{aligned} f(x) &= 1 + \frac{\ell + 1}{2\ell + 1} K_1(x) + \frac{1}{2\ell + 1} K_2(x) \\ &= 1 + \frac{\ell + 1}{2\ell + 1} (2\ell + 1 - 2x) + \frac{1}{2\ell + 1} \left( 2x^2 - 2(2\ell + 1)x + \ell(2\ell + 1) \right). \end{aligned} \]

Clearly, $f_k \ge 0$ for all $1 \le k \le n$, and it is straightforward to verify that
$f(i) \le 0$ for $\ell + 1 = d \le i \le n = 2\ell + 1$. (In fact, f(x) is a quadratic
polynomial such that $f(\ell + 1) = 0 = f(2\ell + 1)$.)

Hence, by Theorem 5.8.7, it follows that

\[ A_2(2\ell + 1, \ell + 1) \le f(0) = 1 + \frac{\ell + 1}{2\ell + 1} (2\ell + 1) + \frac{1}{2\ell + 1} \ell (2\ell + 1) = 2\ell + 2, \]

which is exactly the Plotkin bound (cf. Theorem 5.5.2). (Note: when $\ell$ is even,
Theorem 5.5.3 in fact gives a better bound.)
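The hypotheses of Theorem 5.8.7 are easy to test mechanically for a given choice of coefficients. The sketch below does this for the polynomial of part (iii) with exact rational arithmetic; the function names and the dictionary representation of the coefficients $f_k$ are hypothetical choices made for this illustration.

```python
from fractions import Fraction
from math import comb

def krawtchouk(k, x, n, q=2):
    """K_k(x; n) at an integer 0 <= x <= n."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) * (q - 1) ** (k - j)
               for j in range(k + 1))

def lp_bound_from_polynomial(n, d, coeffs, q=2):
    """coeffs maps k (1 <= k <= n) to f_k; missing keys mean f_k = 0.
    Returns f(0) if f_k >= 0 and f(i) <= 0 for d <= i <= n, otherwise None."""
    def f(x):
        return 1 + sum(fk * krawtchouk(k, x, n, q) for k, fk in coeffs.items())
    if any(fk < 0 for fk in coeffs.values()):
        return None
    if any(f(i) > 0 for i in range(d, n + 1)):
        return None
    return f(0)

def plotkin_polynomial_bound(ell):
    """The polynomial of Example 5.8.8(iii): n = 2*ell + 1, d = ell + 1."""
    n, d = 2 * ell + 1, ell + 1
    coeffs = {1: Fraction(ell + 1, n), 2: Fraction(1, n)}
    return lp_bound_from_polynomial(n, d, coeffs)

bounds = [plotkin_polynomial_bound(ell) for ell in range(1, 7)]
```

For each value of ell tried, the hypotheses hold and the returned value f(0) equals 2*ell + 2, matching the computation above.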


Exercises 

5.1 Find the size, (minimum) distance, information rate and relative minimum
distance of each of the following codes:

(a) the binary code of all the words of length 3;

(b) the ternary code consisting of all the words of length 4 whose second
and fourth coordinates are 0;

(c) the code over the alphabet $\mathbf{F}_p$ (p prime) consisting of all the words of
length 3 whose first coordinate is p - 1 and whose second coordinate
is 1;

(d) the repetition code over the alphabet $\mathbf{F}_p$ (p prime) consisting of the
following words of length n: (0, 0, ..., 0), (1, 1, ..., 1), ...,
(p - 1, p - 1, ..., p - 1).
5.2 For n odd, let C be a self-orthogonal binary [n, (n - 1)/2]-code. Show
that $\overline{C^{\perp}}$, the extended code of $C^{\perp}$, is a self-dual code. (Note: compare
with Exercise 4.26.)

5.3 For any code C over $\mathbf{F}_q$ and any $\epsilon \in \mathbf{F}_q^*$, let

\[ C_\epsilon = \left\{ \left( c_1, \ldots, c_n, \epsilon \sum_{i=1}^{n} c_i \right) : (c_1, \ldots, c_n) \in C \right\}. \]

(In particular, $C_{-1}$ is the extended code $\overline{C}$ of C defined in Definition
5.1.8.)

(i) If C is an (n, M, d)-code, show that $C_\epsilon$ is an (n + 1, M, d')-code,
where $d \le d' \le d + 1$.

(ii) If C is linear, show that $C_\epsilon$ is linear also. Find a parity-check matrix
for $C_\epsilon$ in terms of a parity-check matrix H of C.

5.4 Without using any of the bounds discussed in this chapter, show that

(a) $A_2(6, 5) = 2$, (b) $A_2(7, 5) = 2$.

(Hint: For (a), first show that $A_2(6, 5) \ge 2$ by producing a code explicitly.
Then try to show that $A_2(6, 5) \le 2$ using a simple combinatorial argument
similar to the one in Example 5.2.5.)

5.5 Find an optimal binary code with n = 3 and d = 2. 

5.6 Prove that $A_q(n, d) \le q A_q(n - 1, d)$.

5.7 For each of the following spheres in $A^n = \mathbf{F}_2^n$, list its elements and
compute its volume:

(a) $S_A(110, 4)$, (b) $S_A(1100, 3)$, (c) $S_A(10101, 2)$.

5.8 For each n such that $4 \le n \le 12$, compute the Hamming bound and the
sphere-covering bound for $A_2(n, 4)$.

5.9 Prove that a (6, 20, 4)-code over $\mathbf{F}_7$ cannot be an optimal code.

5.10 Let $q \ge 2$ and $n \ge 2$ be any integers. Show that $A_q(n, 2) = q^{n-1}$.

5.11 Let C be an [n, k, d]-code over $\mathbf{F}_q$, where gcd(d, q) = 1. Suppose that
all the codewords of C have weight congruent to 0 or d modulo q. Using
Exercise 4.30(iv), or otherwise, show the existence of an [n + 1, k, d + 1]-code
over $\mathbf{F}_q$.

5.12 Let C be an optimal code over $\mathbf{F}_{11}$ of length 12 and minimum distance 2.
Show that C must have a transmission rate of at least 5/6.

5.13 For positive integers n, M, d and q > 1 (with $1 \le d \le n$), show that,
if $(M - 1) \sum_{i=0}^{d-1} \binom{n}{i} (q - 1)^i < q^n$, then there exists a q-ary (n, M)-code
of minimum distance at least d. (Note: this is often known as the
Gilbert-Varshamov bound for nonlinear codes.)

5.14 Determine whether each of the following codes exists. Justify your 
answer. 

(a) A binary code with parameters (8, 29, 3). 

(b) A binary linear code with parameters (8, 8, 5). 

(c) A binary linear code with parameters (8, 5, 5). 

(d) A binary linear code with parameters $(24, 2^{12}, 8)$.

(e) A perfect binary linear code with parameters $(63, 2^{57}, 3)$.

5.15 Write down a parity-check matrix H for a binary Hamming code of length
15, where the jth column of H is the binary representation of j. Then
use H to construct a syndrome look-up table and use it to decode the
following words: 

(a) 01010 01010 01000, 

(b) 11100 01110 00111, 

(c) 11001 11001 11000. 

5.16 (i) Show that there exist no binary linear codes with parameters
$[2^m, 2^m - m, 3]$, for any $m \ge 2$.

(ii) Let C be a binary linear code with parameters $[2^m, k, 4]$, for some
$m \ge 2$. Show that $k \le 2^m - m - 1$.

5.17 Prove Proposition 5.3.15. 





5.18 (i) Let $n \ge 3$ be an integer. Show that there is an [n, k, 3]-code defined
over $\mathbb{F}_q$ if and only if $q^{n-k} - 1 \ge (q - 1)n$.

(ii) Find the smallest n for which there exists a ternary [n, 5, 3]-code.

5.19 (i) Let v be a nonzero vector in $\mathbb{F}_q^r$. Show that the set of vectors in
$\mathbb{F}_q^r$ orthogonal to v, i.e., $\{v\}^\perp$, forms a subspace of $\mathbb{F}_q^r$ of dimension
r - 1.

(ii) Let G be a generator matrix for the simplex code S(r, q). Show that,
for a given nonzero vector $v \in \mathbb{F}_q^r$, there are exactly $(q^{r-1} - 1)/(q - 1)$
columns c of G such that $v \cdot c = 0$.

(iii) Using the observation that $S(r, q) = \{vG : v \in \mathbb{F}_q^r\}$, or otherwise,
show that every nonzero codeword of S(r, q) has weight $q^{r-1}$.
(Hint: Use (ii) to determine the number of coordinates of vG that
are equal to 0.)

5.20 Determine the Hamming weight enumerators of Ham(3, 2) and S(3, 2).
Verify that they satisfy the MacWilliams identity (see Exercise 4.49).

5.21 The ternary Hamming code Ham(2, 3) is also known as the tetracode. 

(i) Show that the tetracode is a self-dual MDS code. 

(ii) Without writing down all the elements of Ham(2, 3), determine the 
weights of all its codewords. 

(iii) Determine the Hamming weight enumerator of Ham(2, 3) and show
that the MacWilliams identity (see Exercise 4.49) holds for
$C = C^\perp = \mathrm{Ham}(2, 3)$.

5.22 Let $Q_6$ denote the hexacode defined in Exercise 4.10(b).

(i) Show that $Q_6$ is a [6, 3, 4]-code over $\mathbb{F}_4$. (Hence, $Q_6$ is an MDS
quaternary code.)

(ii) Let $Q'_6$ be the code obtained from $Q_6$ by deleting the last coordinate
from every codeword. Show that $Q'_6$ is a Hamming code over $\mathbb{F}_4$.

5.23 (i) Show that the all-one vector (1, 1, . . . , 1) is in the extended binary
Golay code $G_{24}$.

(ii) Deduce from (i) that $G_{24}$ does not have any word of weight 20.

5.24 Prove Proposition 5.3.22.

5.25 (i) Show that every word of weight 4 in $\mathbb{F}_2^{23}$ is of distance 3 from exactly
one codeword in the binary Golay code $G_{23}$.

(ii) Use (i) to count the number of codewords of weight 7 in $G_{23}$.

(iii) Use (ii) to show that the extended binary Golay code $G_{24}$ contains
precisely 759 codewords of weight 8.

5.26 Show that the extended binary Golay code $G_{24}$ has the weight distribution
shown in Table 5.5 for its codewords.

5.27 Verify the MacWilliams identity (see Exercise 4.49) with $C = C^\perp = G_{24}$.

Table 5.5.

Weight                 0    4    8     12     16    20    24
Number of codewords    1    0    759   2576   759   0     1


5.28 Prove that the extended ternary Golay code $G_{12}$ is a [12, 6, 6]-code.

5.29 Show that the ternary Golay code $G_{11}$ satisfies the Hamming bound.

5.30 Prove Theorem 5.5.3. (Hint: When d is even and n < 2d, mimic the
proof of Theorem 5.5.2. Divide into the two cases M even and M odd,
and maximize the expression $\sum_{i=1}^{n} \sum_{a \in \mathbb{F}_2} n_{i,a}(M - n_{i,a})$ in each case.
For the case of even d and n = 2d, apply Exercise 5.6, with q = 2, and
the previous case. When d is odd, apply Theorem 5.1.11 with the result
for even d.)

5.31 Let C be the code over $\mathbb{F}_4 = \{0, 1, \alpha, \alpha^2\}$ with generator matrix

$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & \alpha \end{pmatrix}.$$

(i) Show that C is an MDS code.

(ii) Write down a generator matrix for the dual $C^\perp$.

(iii) Show that $C^\perp$ is an MDS code.

5.32 Show that the only binary MDS codes are the trivial ones. 

5.33 Suppose there is a q-ary MDS code C of length n and dimension k, where
k < n.

(i) Show that there is also a q-ary MDS code of length n - 1 and
dimension k.

(ii) For a given $1 \le i \le n$, let $C_i$ be the subcode of C consisting of all the
codewords with 0 in the ith position, and let $D_i$ be the code obtained
by deleting the ith coordinate from every codeword of $C_i$. Show that
$D_i$ is an MDS code. (Hint: You may need to show that there is at
least one minimum weight codeword of C with 0 in the ith position.)

5.34 For each n such that $9 \le n \le 16$, compare the Singleton, Plotkin and
Hamming upper bounds for $A_2(n, 9)$.

5.35 Suppose there exists a binary linear code C of parameters [16, 8, 6].

(i) Let C' be the residual code of C with respect to a codeword of weight
6. Show that C' is a binary linear code of parameters [10, 7, d'],
where $3 \le d' \le 4$.

(ii) Use Exercise 5.32 to show that d' = 3.

(iii) Using the Hamming bound, or otherwise, show that such a C' cannot
exist.

5.36 A binary (n, M, d)-code C is called a constant-weight binary code if there
exists an integer w such that wt(c) = w for all $c \in C$. In this case, we
say that C is a constant-weight binary (n, M, d; w)-code.

(a) Show that the minimum distance of a constant-weight binary code is
always even.

(b) Show that a constant-weight binary (n, M, d; w)-code satisfies $M \le$

e 

(c) Prove that a constant-weight binary (n, M, d; w)-code can detect at
least one error.

5.37 Let $A_2(n, d, w)$ be the maximum possible number M of codewords in a
constant-weight binary (n, M, d; w)-code. Show that

(a) $1 \le A_2(n, d, w) \le \binom{n}{w}$;

(b) $A_2(n, 2, w) = \binom{n}{w}$;

(c) $A_2(n, d, w) = 1$ for d > 2w;

(d) $A_2(n, d, w) = A_2(n, d, n - w)$.

5.38 Use the Griesmer bound to find an upper bound for d for the q-ary linear
codes of the following n and k:

(a) q = 2, n = 10 and k = 3;

(b) q = 3, n = 8 and k = 4;

(c) q = 4, n = 10 and k = 5;

(d) q = 5, n = 9 and k = 2.

5.39 For a prime power q and positive integers k and u with k > u > 0,
the MacDonald code $C_{k,u}$ is a q-ary linear code, of parameters
$[(q^k - q^u)/(q - 1), k, q^{k-1} - q^{u-1}]$, that has nonzero codewords of only two
possible weights: $q^{k-1} - q^{u-1}$ and $q^{k-1}$. Show that the MacDonald
codes attain the Griesmer bound.

5.40 Let C be an [n, k, d]-code over $\mathbb{F}_q$ and let $c \in C$ be a codeword of weight
w, where w < dq/(q - 1). Show that the residual code Res(C, c) is an
[n - w, k - 1, d']-code, where $d' \ge d - w + \lceil w/q \rceil$.

5.41 Let C be a $[q^2, 4, q^2 - q - 1]$-code over $\mathbb{F}_q$.

(i) By considering Res(C, c), where $\mathrm{wt}(c) = q^2 - t$ with $2 \le t \le q - 1$,
or otherwise, show that the only possible weights of the codewords
in C are: 0, $q^2 - q - 1$, $q^2 - q$, $q^2 - 1$ and $q^2$.

(ii) Show the existence of a $[q^2 + 1, 4, q^2 - q]$-code over $\mathbb{F}_q$. (Hint:
Compare with Exercise 5.11.)

5.42 Prove the properties of the Krawtchouk polynomials listed in
Proposition 5.8.2. (Hint: For (ii), use the fact that

$$(1 + (q - 1)z)^{n-x}(1 - z)^x = (1 + (q - 1)z)^n \left(1 - \frac{qz}{1 + (q - 1)z}\right)^x.$$

For (iv), multiply both sides of the equality by $y^k z^l$ and sum over all
$k, l \ge 0$. For (vii), use (ii). For (viii), use (i) by multiplying two power
series together.)

5.43 Show that the Krawtchouk polynomials satisfy the following recurrence
relation:

$$(k + 1)K_{k+1}(x) = (k + (q - 1)(n - k) - qx)K_k(x) - (q - 1)(n - k + 1)K_{k-1}(x).$$

5.44 Show that K k (x) = ~ 

5.45 Let q = 2. Show that:

(a) $K_0(x) = 1$;

(b) $K_1(x) = -2x + n$;

(c) $K_2(x) = 2x^2 - 2nx + \binom{n}{2}$;

(d) $K_3(x) = -4x^3/3 + 2nx^2 - (n^2 - n + 2/3)x + \binom{n}{3}$.

5.46 Let $\zeta$ be a primitive qth root of unity in $\mathbb{C}$. Suppose $u \in \mathbb{Z}_q^n$ is a word of
weight i. Show that

$$\sum_{w \in \mathbb{Z}_q^n,\ \mathrm{wt}(w) = k} \zeta^{u \cdot w} = K_k(i),$$

where, for $u = (u_1, \ldots, u_n)$ and $w = (w_1, \ldots, w_n)$, $u \cdot w = u_1 w_1 + \cdots + u_n w_n$.

5.47 Use the linear programming bound (Theorem 5.8.7) to show that the
Hadamard matrix code of parameters (2d, 4d, d), with d even, is an optimal
code. (Hint: Use $f(x) = 1 + K_1(x) + \frac{1}{d}K_2(x)$.)

5.48 Let d be such that 2d > n. Use the linear programming bound
(Theorem 5.8.7) to show that $A_2(n, d) \le 2d/(2d - n)$. Note that this
bound is slightly weaker than the Plotkin bound. (Hint: Use f(x) = 
1 + 




6 Constructions of linear codes


For an (n, M, d)-code C over $\mathbb{F}_q$, theoretically we would like both
$\mathcal{R}(C) = (\log_q M)/n$ and $\delta(C) = (d - 1)/n$ to be as large as possible. In other words,
we want M to be as large as possible for fixed n and d. Of course, the ideal
situation is to find codes with size equal to $A_q(n, d)$ for all given q, n and d.
However, from the previous chapter, we know that this is still an open problem
that seems difficult to solve. Fortunately, in practice, we are content to use
codes with sizes close to $A_q(n, d)$. In order to do so, we have to find ways to
construct such codes.

The construction of good codes has almost as long a history as coding 
theory itself. The Hamming codes in the previous chapter are one of the 
earliest classes of codes. Later on, many other codes which are also in practical 
use were invented: for instance, Reed-Muller codes (see Section 6.2); BCH 
codes (Section 8.1); Reed-Solomon codes (Section 8.2); and Goppa codes 
(Section 9.3). 

In this chapter, we concentrate mainly on the construction of linear codes. 
For the construction of nonlinear codes, the reader is advised to refer to ref. [13] 
and the following webpage maintained by Simon Litsyn of Tel Aviv University, 
E. M. Rains of IDA and N. J. A. Sloane of AT&T Labs-Research: 

http://www.research.att.com/~njas/codes/And/.


6.1 Propagation rules 

In this section, we study several constructions of new codes based on old codes. 
Our strategy is to build codes with larger sizes or longer lengths from codes of 
smaller sizes or shorter lengths. From the beginning of coding theory, many 
propagation rules have been proposed, and some of them have become standard
constructions in coding theory. We feature a few well-known propagation rules
in this section and place some others in the exercises to this chapter.

Theorem 6.1.1 Suppose there is an [n, k, d]-linear code over $\mathbb{F}_q$. Then

(i) (lengthening) there exists an [n + r, k, d]-linear code over $\mathbb{F}_q$ for any
$r \ge 1$;

(ii) (subcodes) there exists an [n, k - r, d]-linear code over $\mathbb{F}_q$ for any
$1 \le r \le k - 1$;

(iii) (puncturing) there exists an [n - r, k, d - r]-linear code over $\mathbb{F}_q$ for any
$1 \le r \le d - 1$;

(iv) there exists an [n, k, d - r]-linear code over $\mathbb{F}_q$ for any $1 \le r \le d - 1$;

(v) there exists an [n - r, k - r, d]-linear code over $\mathbb{F}_q$ for any $1 \le r \le k - 1$.

Proof. Let C be an [n, k, d]-linear code over $\mathbb{F}_q$.

(i) By mathematical induction, it suffices to show the existence of an
[n + 1, k, d]-linear code over $\mathbb{F}_q$. We add a new coordinate 0 to all the codewords
of C to form a new code,

$$\{(u_1, \ldots, u_n, 0) : (u_1, \ldots, u_n) \in C\}.$$

It is clear that the above code is an [n + 1, k, d]-linear code over $\mathbb{F}_q$.

(ii) Let c be a nonzero codeword of C with wt(c) = d. We extend c to form
a basis of C: $\{c_1 = c, \ldots, c_k\}$. Consider the new code $\langle \{c_1, \ldots, c_{k-r}\} \rangle$
spanned by the first k - r codewords in the basis. It is obvious that the new
code has the parameters [n, k - r, d].

(iii) Let $c \in C$ be a codeword of weight d. For each codeword of C, we
delete a fixed set of r positions where c has nonzero coordinates (compare
with the proof of Theorem 5.1.11). It is easy to see that the new code is an
[n - r, k, d - r]-linear code.

(iv) The desired result follows from (i) and (iii).

(v) If k = n, then we must have that d = 1. Thus, the space $\mathbb{F}_q^{n-r}$ is a code
with parameters [n - r, k - r, d].

Now we assume that k < n. It also suffices to show the existence of an
[n - 1, k - 1, d]-linear code for $k \ge 2$. Let C be an [n, k, d]-linear code over
$\mathbb{F}_q$. We may assume that C has a parity-check matrix of the form

$$H = (I_{n-k} \mid X).$$

Deleting the last column of H, we obtain an $(n - k) \times (n - 1)$ matrix $H_1$. It is
clear that all the rows of $H_1$ are linearly independent and that any d - 1 columns
of $H_1$ are linearly independent. Thus, the linear code with $H_1$ as a parity-check
matrix has parameters $[n - 1, k - 1, d_1]$ with $d_1 \ge d$. By part (iv), we have an
[n - 1, k - 1, d]-linear code. □
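Parts (i)-(iii) can be tried out on a small code. The following Python sketch is only an illustration: it assumes one particular systematic generator matrix for a binary [7, 4, 3] Hamming code, and the helper names `span` and `params` are hypothetical, not taken from the text.

```python
from itertools import product

def span(rows):
    """All F_2-linear combinations of the given rows (tuples of 0/1)."""
    n = len(rows[0])
    return {
        tuple(sum(c * r[j] for c, r in zip(coeffs, rows)) % 2 for j in range(n))
        for coeffs in product((0, 1), repeat=len(rows))
    }

def params(code):
    """Return (n, k, d) of a binary linear code given as a set of codewords."""
    n = len(next(iter(code)))
    k = len(code).bit_length() - 1            # |C| = 2^k
    d = min(sum(w) for w in code if any(w))   # least nonzero weight
    return n, k, d

# Assumed generator matrix of a binary [7, 4, 3] Hamming code; its first row
# plays the role of the codeword c of weight d = 3 used in the proof.
G = [
    (1, 0, 0, 0, 0, 1, 1),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 1, 1, 0),
    (0, 0, 0, 1, 1, 1, 1),
]
C = span(G)

lengthened = {w + (0,) for w in C}   # (i): append a zero coordinate
subcode = span(G[:2])                # (ii): span of c_1 = c and c_2
punctured = {w[1:] for w in C}       # (iii): delete position 1, where c is nonzero

print(params(C), params(lengthened), params(subcode), params(punctured))
```

Here r = 1 in (i) and (iii) and r = 2 in (ii); larger r is obtained by repeating the same step.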

Remark 6.1.2 In fact, the above theorem produces codes with worse 
parameters than the old ones. We usually do not make new codes using these 
constructions. However, they are quite useful when we study codes. 

The following result follows immediately from Theorem 6.1.1 (i)-(iv). 

Corollary 6.1.3 If there is an [n, k, d]-linear code over $\mathbb{F}_q$, then for any $r \ge 0$,
$0 \le s \le k - 1$ and $0 \le t \le d - 1$, there exists an [n + r, k - s, d - t]-linear
code over $\mathbb{F}_q$.

Example 6.1.4 A binary Hamming code of length 7 is a [7, 4, 3]-linear code.
Thus, we have binary linear codes with parameters [n, 4, 3] for any $n \ge 7$ and
also binary linear codes with parameters [7, k, 3] for any $1 \le k \le 4$.

Theorem 6.1.5 (Direct sum.) Let $C_i$ be an $[n_i, k_i, d_i]$-linear code over $\mathbb{F}_q$ for
i = 1, 2. Then the direct sum of $C_1$ and $C_2$ defined by

$$C_1 \oplus C_2 = \{(c_1, c_2) : c_1 \in C_1, c_2 \in C_2\}$$

is an $[n_1 + n_2, k_1 + k_2, \min\{d_1, d_2\}]$-linear code over $\mathbb{F}_q$.

Proof. It is easy to verify that $C_1 \oplus C_2$ is a linear code over $\mathbb{F}_q$. The length of
$C_1 \oplus C_2$ is clear. As the size of $C_1 \oplus C_2$ is equal to the product of the size of
$C_1$ and that of $C_2$, we obtain

$$k := \dim(C_1 \oplus C_2) = \log_q(|C_1 \oplus C_2|) = \log_q(|C_1| \cdot |C_2|) = k_1 + k_2.$$

We may assume that $d_1 \le d_2$. Let $u \in C_1$ with $\mathrm{wt}(u) = d_1$. Then $(u, 0) \in C_1 \oplus C_2$.
Hence, $d(C_1 \oplus C_2) \le \mathrm{wt}((u, 0)) = d_1$. On the other hand, for any nonzero
codeword $(c_1, c_2) \in C_1 \oplus C_2$ with $c_1 \in C_1$ and $c_2 \in C_2$, we have either $c_1 \ne 0$
or $c_2 \ne 0$. Thus,

$$\mathrm{wt}((c_1, c_2)) = \mathrm{wt}(c_1) + \mathrm{wt}(c_2) \ge d_1.$$

This completes the proof. □

Remark 6.1.6 Let $G_i$ be a generator matrix of $C_i$, for i = 1, 2. Then it is easy
to see that the matrix

$$\begin{pmatrix} G_1 & O \\ O & G_2 \end{pmatrix}$$

is a generator matrix of $C_1 \oplus C_2$, where O stands for the zero matrix (note that
the two zero matrices have different sizes).

Example 6.1.7 Let

$$C_1 = \{000, 110, 101, 011\}$$

be a binary [3, 2, 2]-linear code, and let

$$C_2 = \{0000, 1111\}$$

be a binary [4, 1, 4]-linear code. Then,

$$C_1 \oplus C_2 = \{0000000, 1100000, 1010000, 0110000, 0001111, 1101111, 1011111, 0111111\}$$

is a binary [7, 3, 2]-linear code.
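The direct sum in Example 6.1.7 is easy to reproduce by machine. The sketch below is illustrative only; it stores codewords as bit strings, and the helper names `direct_sum` and `min_distance` are assumptions made for this example.

```python
from itertools import product

def direct_sum(c1, c2):
    """Return {(u, v) : u in c1, v in c2}, with words written as bit strings."""
    return {u + v for u, v in product(c1, c2)}

def min_distance(code):
    """For a linear code, d equals the least weight of a nonzero codeword."""
    return min(w.count("1") for w in code if "1" in w)

C1 = {"000", "110", "101", "011"}   # binary [3, 2, 2]-linear code
C2 = {"0000", "1111"}               # binary [4, 1, 4]-linear code
C = direct_sum(C1, C2)

print(sorted(C))
print(len(C), min_distance(C))      # size 2^(k1 + k2) and min{d1, d2}
```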

Theorem 6.1.8 ((u, u + v)-construction.) Let $C_i$ be an $[n, k_i, d_i]$-linear code
over $\mathbb{F}_q$, for i = 1, 2. Then the code C defined by

$$C = \{(u, u + v) : u \in C_1, v \in C_2\}$$

is a $[2n, k_1 + k_2, \min\{2d_1, d_2\}]$-linear code over $\mathbb{F}_q$.

Proof. It is easy to verify that C is a linear code over $\mathbb{F}_q$. The length of C is
clear.

It is easy to show that the map

$$C_1 \oplus C_2 \to C, \quad (c_1, c_2) \mapsto (c_1, c_1 + c_2)$$

is a bijection. Thus, the size of C is equal to the product of the size of $C_1$ and
that of $C_2$; i.e., $k := \dim C = k_1 + k_2$.

For any nonzero codeword $(c_1, c_1 + c_2) \in C$ with $c_1 \in C_1$ and $c_2 \in C_2$, we
have either $c_1 \ne 0$ or $c_2 \ne 0$.

Case (1) $c_2 = 0$. In this case, we have $c_1 \ne 0$. Thus,

$$\mathrm{wt}((c_1, c_1 + c_2)) = \mathrm{wt}((c_1, c_1)) = 2\,\mathrm{wt}(c_1) \ge 2d_1 \ge \min\{2d_1, d_2\}.$$

Case (2) $c_2 \ne 0$. Then

$$\mathrm{wt}((c_1, c_1 + c_2)) = \mathrm{wt}(c_1) + \mathrm{wt}(c_1 + c_2) \ge \mathrm{wt}(c_1) + (\mathrm{wt}(c_2) - \mathrm{wt}(c_1)) = \mathrm{wt}(c_2) \ge d_2 \ge \min\{2d_1, d_2\}$$

(see Lemma 4.3.6 for the inequality).

This shows that $d(C) \ge \min\{2d_1, d_2\}$. On the other hand, let $x \in C_1$ and $y \in C_2$
with $\mathrm{wt}(x) = d_1$ and $\mathrm{wt}(y) = d_2$. Then $(x, x), (0, y) \in C$ and

$$d(C) \le \mathrm{wt}((x, x)) = 2d_1$$

and

$$d(C) \le \mathrm{wt}((0, y)) = d_2.$$

Thus, $d(C) \le \min\{2d_1, d_2\}$. This completes the proof. □

Remark 6.1.9 Let $G_i$ be a generator matrix of $C_i$, for i = 1, 2. Then it is easy
to see that the matrix

$$\begin{pmatrix} G_1 & G_1 \\ O & G_2 \end{pmatrix}$$

is a generator matrix of C from the (u, u + v)-construction in Theorem 6.1.8,
where O stands for the zero matrix.

Example 6.1.10 Let

$$C_1 = \{000, 110, 101, 011\}$$

be a binary [3, 2, 2]-linear code, and let

$$C_2 = \{000, 111\}$$

be a binary [3, 1, 3]-linear code. Then,

$$C = \{000000, 110110, 101101, 011011, 000111, 110001, 101010, 011100\}$$

is a binary [6, 3, 3]-linear code.
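The (u, u + v)-construction of Example 6.1.10 can be checked in the same way. This is a minimal sketch for binary codes stored as bit strings; the function names `add` and `u_u_plus_v` are hypothetical.

```python
def add(x, y):
    """Coordinatewise sum of two binary words of equal length."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(x, y))

def u_u_plus_v(c1, c2):
    """Return {(u, u + v) : u in c1, v in c2} for binary codes of equal length."""
    return {u + add(u, v) for u in c1 for v in c2}

C1 = {"000", "110", "101", "011"}   # binary [3, 2, 2]-linear code
C2 = {"000", "111"}                 # binary [3, 1, 3]-linear code
C = u_u_plus_v(C1, C2)

d = min(w.count("1") for w in C if "1" in w)
print(sorted(C))
print(len(C), d)                    # size 2^(k1 + k2) and min{2 d1, d2}
```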

Let $\mathbf{1} = (1, 1, \ldots, 1)$ denote the all-one vector and let $\mathbf{0} = (0, 0, \ldots, 0)$
denote the zero vector. (The length is unspecified here and depends on the
context.)

Corollary 6.1.11 Let A be a binary [n, k, d]-linear code. Then the code C
defined by

$$C = \{(c, c) : c \in A\} \cup \{(c, \mathbf{1} + c) : c \in A\}$$

is a binary $[2n, k + 1, \min\{n, 2d\}]$-linear code.

Proof. In Theorem 6.1.8, taking $C_1 = A$ and $C_2 = \{\mathbf{0}, \mathbf{1}\}$, we obtain the desired
result. □

Example 6.1.12 Let

$$A = \{00, 01, 10, 11\}$$

be a binary [2, 2, 1]-linear code. Put

$$C = \{(c, c) : c \in A\} \cup \{(c, \mathbf{1} + c) : c \in A\} = \{0000, 0101, 1010, 1111, 0011, 0110, 1001, 1100\}.$$

Then C is a binary [4, 3, 2]-linear code.


6.2 Reed-Muller codes 

Reed-Muller codes are among the oldest known codes and have found
widespread applications. For each positive integer m and each integer r satisfying
$0 \le r \le m$, there is an rth order Reed-Muller code $\mathcal{R}(r, m)$, which is
a binary linear code of parameters $[2^m, \binom{m}{0} + \binom{m}{1} + \cdots + \binom{m}{r}, 2^{m-r}]$. In fact,
$\mathcal{R}(1, 5)$ was used by Mariner 9 to transmit black and white pictures of Mars
back to the Earth in 1972. Reed-Muller codes also admit a special decoding
called the Reed decoding. There are also generalizations to nonbinary fields.
We concentrate mainly on the first order binary Reed-Muller codes.

There are many ways to define the Reed-Muller codes. We choose an
inductive definition. Remember that we are in the binary setting.

Definition 6.2.1 The (first order) Reed-Muller codes $\mathcal{R}(1, m)$ are binary codes
defined, for all integers $m \ge 1$, recursively as follows:

(i) $\mathcal{R}(1, 1) = \mathbb{F}_2^2 = \{00, 01, 10, 11\}$;

(ii) for $m \ge 1$,

$$\mathcal{R}(1, m + 1) = \{(u, u) : u \in \mathcal{R}(1, m)\} \cup \{(u, u + \mathbf{1}) : u \in \mathcal{R}(1, m)\}.$$

Example 6.2.2 $\mathcal{R}(1, 2) = \{0000, 0101, 1010, 1111, 0011, 0110, 1001, 1100\}$.
A generator matrix of $\mathcal{R}(1, 2)$ is

$$\begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix}.$$

Proposition 6.2.3 For $m \ge 1$, the Reed-Muller code $\mathcal{R}(1, m)$ is a binary
$[2^m, m + 1, 2^{m-1}]$-linear code, in which every codeword except $\mathbf{0}$ and $\mathbf{1}$ has
weight $2^{m-1}$.

Proof. It is clear that $\mathcal{R}(1, 1)$ is a binary [2, 2, 1]-linear code. We note
that $\mathcal{R}(1, m)$ is obtained from $\mathcal{R}(1, m - 1)$ by the construction in Corollary
6.1.11. Using mathematical induction, we assume that $\mathcal{R}(1, m - 1)$ is a
binary $[2^{m-1}, m, 2^{m-2}]$-linear code. By Corollary 6.1.11, $\mathcal{R}(1, m)$ is a binary
$[2 \cdot 2^{m-1}, m + 1, \min\{2 \cdot 2^{m-2}, 2^{m-1}\}] = [2^m, m + 1, 2^{m-1}]$-linear code.

Now we prove that, except for $\mathbf{0}$ and $\mathbf{1}$, every codeword of $\mathcal{R}(1, m + 1)$ has
weight $2^m = 2^{(m+1)-1}$.

A word in $\mathcal{R}(1, m + 1)$ is either of the type (u, u) or $(u, u + \mathbf{1})$, where u is
a word in $\mathcal{R}(1, m)$.

Case (1) (u, u), where $u \in \mathcal{R}(1, m)$: u can be neither $\mathbf{0}$ nor $\mathbf{1}$, since otherwise
(u, u) is again the zero or the all-one vector. Hence, by the inductive hypothesis,
u has weight $2^{m-1}$. Therefore, (u, u) has weight $2 \cdot 2^{m-1} = 2^m$.

Case (2) $(u, u + \mathbf{1})$, where $u \in \mathcal{R}(1, m)$:

(a) If u is neither $\mathbf{0}$ nor $\mathbf{1}$, then it has weight $2^{m-1}$; i.e., exactly half its coordinates
are 1. Hence, half of the coordinates of $u + \mathbf{1}$ are 1; i.e., the weight
of $u + \mathbf{1}$ is also $2^{m-1}$. Therefore, the weight of $(u, u + \mathbf{1})$ is exactly $2^m$.

(b) If $u = \mathbf{0}$, then $u + \mathbf{1} = \mathbf{1}$, so again the weight of $(\mathbf{0}, \mathbf{0} + \mathbf{1})$ is $2^m$.

(c) If $u = \mathbf{1}$, then $u + \mathbf{1} = \mathbf{0}$, so the weight of $(\mathbf{1}, \mathbf{1} + \mathbf{1})$ is again $2^m$. □
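The recursion of Definition 6.2.1 and the weight claim of Proposition 6.2.3 can be checked for small m. The following Python sketch is illustrative; codewords are stored as bit strings, and `complement` and `rm1` are names chosen for this example only.

```python
def complement(u):
    """Return u + 1, i.e. flip every bit of the word u."""
    return "".join("1" if bit == "0" else "0" for bit in u)

def rm1(m):
    """Codewords of the first order Reed-Muller code R(1, m), m >= 1."""
    code = {"00", "01", "10", "11"}                  # R(1, 1)
    for _ in range(m - 1):
        code = {u + u for u in code} | {u + complement(u) for u in code}
    return code

for m in range(1, 6):
    code = rm1(m)
    # Weights of all codewords other than the zero and the all-one word.
    middle_weights = {w.count("1") for w in code if "0" in w and "1" in w}
    print(m, len(code), middle_weights)
```

For each m the loop prints the number of codewords, which should be $2^{m+1}$, and the set of weights of the codewords other than $\mathbf{0}$ and $\mathbf{1}$, which should be the single value $2^{m-1}$.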


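Proposition 6.2.4 (i) A generator matrix of $\mathcal{R}(1, 1)$ is

$$G_1 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

(ii) If $G_m$ is a generator matrix for $\mathcal{R}(1, m)$, then a generator matrix for
$\mathcal{R}(1, m + 1)$ is

$$G_{m+1} = \begin{pmatrix} G_m & G_m \\ 0 \cdots 0 & 1 \cdots 1 \end{pmatrix}.$$

Proof. (i) is obvious, while (ii) is an immediate result of Corollary 6.1.11 and
Remark 6.1.9. □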


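Example 6.2.5 Using the generator matrix

$$G_2 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix}$$

of $\mathcal{R}(1, 2)$ from Example 6.2.2, Proposition 6.2.4(ii) gives the following
generator matrix of $\mathcal{R}(1, 3)$:

$$G_3 = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{pmatrix}.$$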
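The recursive step of Proposition 6.2.4(ii) is mechanical, as the following sketch suggests. It assumes the starting matrix $G_1$ with rows (1, 1) and (0, 1), which is the matrix consistent with $G_2$ in Example 6.2.2; the function name `next_generator` is hypothetical.

```python
def next_generator(G):
    """Given a generator matrix of R(1, m), return one for R(1, m + 1).

    Each row g becomes (g, g), and a final row (0...0, 1...1) is appended.
    """
    width = len(G[0])
    rows = [row + row for row in G]
    rows.append([0] * width + [1] * width)
    return rows

G1 = [[1, 1], [0, 1]]        # assumed generator matrix of R(1, 1)
G2 = next_generator(G1)
G3 = next_generator(G2)

for row in G3:
    print(row)
```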


Proposition 6.2.6 The dual code $\mathcal{R}(1, m)^\perp$ is (equivalent to) the extended
binary Hamming code $\overline{\mathrm{Ham}(m, 2)}$.

Proof. From Proposition 6.2.4(ii), starting with

$$G_1 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},$$

it is clear that $G_m$ is of the form

$$\begin{pmatrix} 1 & 1 \cdots 1 \\ \mathbf{0}^T & H_m \end{pmatrix},$$

where $H_m$ is some matrix. Moving the first coordinate to the last and moving
the first row of the matrix to the last, we obtain the following generator matrix
$G'_m$ for an equivalent code:

$$G'_m = \begin{pmatrix} H_m & \mathbf{0}^T \\ 1 \cdots 1 & 1 \end{pmatrix}.$$

Using Theorem 5.1.9, if we show that $H_m$ is a parity-check matrix for
Ham(m, 2), then $G'_m$ is the parity-check matrix for $\overline{\mathrm{Ham}(m, 2)}$, so $\mathcal{R}(1, m)^\perp$ is
equivalent to $\overline{\mathrm{Ham}(m, 2)}$.

To show $H_m$ is a parity-check matrix for Ham(m, 2), we need to show that
the columns of $H_m$ consist of all the nonzero vectors of length m.

Indeed, when m = 1, 2, the columns of $H_m$ consist of all the nonzero vectors
of length m. Now suppose that the columns of $H_m$ consist of all the nonzero
vectors of length m, for some m. By the definition of $G_m$, it follows readily
that the columns of $H_{m+1}$ consist of the following:

$$\begin{pmatrix} c \\ 0 \end{pmatrix}, \quad \begin{pmatrix} c \\ 1 \end{pmatrix}, \quad \begin{pmatrix} \mathbf{0} \\ 1 \end{pmatrix},$$

where c is one of the columns of $H_m$ and $\mathbf{0}$ is the zero vector of length m. It
is clear that the vectors in this list make up exactly all the nonzero vectors of
length m + 1. Hence, by induction, the columns of $H_m$ consist of all the nonzero
vectors of length m. □
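The key step of the proof, that the columns of $H_m$ run through all nonzero vectors of length m, can be confirmed numerically for small m. The sketch below assumes the recursion of Proposition 6.2.4 with $G_1$ having rows (1, 1) and (0, 1); the function names are hypothetical.

```python
from itertools import product

def generator(m):
    """Generator matrix G_m of R(1, m) built by the recursion, starting from G_1."""
    G = [[1, 1], [0, 1]]
    for _ in range(m - 1):
        width = len(G[0])
        G = [row + row for row in G] + [[0] * width + [1] * width]
    return G

def h_columns(m):
    """Columns of H_m, i.e. of G_m with its first row and first column removed."""
    H = [row[1:] for row in generator(m)[1:]]
    return [tuple(row[j] for row in H) for j in range(len(H[0]))]

for m in range(1, 6):
    cols = h_columns(m)
    nonzero = {v for v in product((0, 1), repeat=m) if any(v)}
    print(m, len(cols), set(cols) == nonzero)
```

Since $H_m$ has $2^m - 1$ columns and there are $2^m - 1$ nonzero vectors of length m, equality of the two sets also shows that no column is repeated.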

Finally, we define the rth order Reed-Muller codes.

Definition 6.2.7 (i) The zeroth order Reed-Muller codes $\mathcal{R}(0, m)$, for $m \ge 0$,
are defined to be the repetition codes $\{\mathbf{0}, \mathbf{1}\}$ of length $2^m$.

(ii) The first order Reed-Muller codes $\mathcal{R}(1, m)$, for $m \ge 1$, are defined as in
Definition 6.2.1.

(iii) For any $r \ge 2$, the rth order Reed-Muller codes $\mathcal{R}(r, m)$ are defined,
for $m \ge r - 1$, recursively by

$$\mathcal{R}(r, m + 1) = \begin{cases} \mathbb{F}_2^{2^r} & \text{if } m = r - 1, \\ \{(u, u + v) : u \in \mathcal{R}(r, m),\ v \in \mathcal{R}(r - 1, m)\} & \text{if } m > r - 1. \end{cases}$$
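Definition 6.2.7 translates directly into a recursive procedure. The sketch below is an illustration under stated assumptions: it treats the first order codes through the same (u, u + v) recursion (which agrees with Definition 6.2.1, since $\mathcal{R}(0, m) = \{\mathbf{0}, \mathbf{1}\}$), it enumerates all codewords and so is only practical for small m, and the function name `reed_muller` is hypothetical. It compares the size and minimum distance with the parameters quoted at the start of this section.

```python
from itertools import product
from math import comb

def reed_muller(r, m):
    """Codewords of R(r, m), as tuples of bits, for 0 <= r <= m."""
    if not 0 <= r <= m:
        raise ValueError("need 0 <= r <= m")
    n = 2 ** m
    if r == 0:
        return {(0,) * n, (1,) * n}               # repetition code
    if m == r:
        return set(product((0, 1), repeat=n))     # the whole space F_2^(2^r)
    first = reed_muller(r, m - 1)
    second = reed_muller(r - 1, m - 1)
    return {u + tuple(a ^ b for a, b in zip(u, v)) for u in first for v in second}

for r, m in [(1, 3), (2, 3), (2, 4)]:
    code = reed_muller(r, m)
    k = sum(comb(m, i) for i in range(r + 1))
    d = min(sum(w) for w in code if any(w))
    print((r, m), len(code) == 2 ** k, d == 2 ** (m - r))
```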


6.3 Subfield codes 

For the two previous sections, we made use of codes over $\mathbb{F}_q$ to construct new
ones over the same ground field. In this section, we will employ codes over an
extension $\mathbb{F}_{q^m}$ to obtain codes over $\mathbb{F}_q$.

Theorem 6.3.1 (Concatenated code.) Let A be an [N, K, D]-linear code over
$\mathbb{F}_{q^m}$. Then there exists an [nN, mK, d']-linear code C over $\mathbb{F}_q$ with
$d' = d(C) \ge dD$, provided that there is an [n, m, d]-linear code B over $\mathbb{F}_q$. Moreover,
an [nN, mK, dD]-linear code over $\mathbb{F}_q$ can be obtained.

Proof. As $\mathbb{F}_{q^m}$ can be viewed as an $\mathbb{F}_q$-vector space of dimension m, we set up
an $\mathbb{F}_q$-linear transformation $\phi$ between $\mathbb{F}_{q^m}$ and B such that $\phi$ is bijective.

We extend the map $\phi$ and obtain a map

$$\phi^* : \mathbb{F}_{q^m}^N \to \mathbb{F}_q^{nN}, \quad (v_1, \ldots, v_N) \mapsto (\phi(v_1), \ldots, \phi(v_N)).$$

It is easy to see that $\phi^*$ is an $\mathbb{F}_q$-linear transformation from $\mathbb{F}_{q^m}^N$ to $\mathbb{F}_q^{nN}$. The
map $\phi^*$ is one-to-one (but not onto unless n = m).

The code A is an $\mathbb{F}_q$-subspace of $\mathbb{F}_{q^m}^N$. Let C be the image of A under
$\phi^*$; i.e., $C = \phi^*(A)$. Then C is a subspace of $\mathbb{F}_q^{nN}$ since $\phi^*$ is an $\mathbb{F}_q$-linear
transformation.

The length of C is clearly nN. To determine the dimension of C, we recall
a relationship between the size of a vector space V over $\mathbb{F}_r$ and its dimension
(see Theorem 4.1.15(i)):

$$\dim_{\mathbb{F}_r} V = \log_r |V| \quad \text{or} \quad |V| = r^{\dim_{\mathbb{F}_r} V}. \qquad (6.1)$$

Thus, we have

$$\dim_{\mathbb{F}_q} C = \log_q |C| = \log_q |A| = \log_q \left( (q^m)^{\dim_{\mathbb{F}_{q^m}} A} \right) = \log_q q^{mK} = mK,$$

where the second equality holds as $\phi^*$ is one-to-one.

Finally, we look at the minimum distance of C. Let $(u_1, \ldots, u_N)$ be a
nonzero codeword of A. If $u_i \ne 0$ for some $1 \le i \le N$, then $\phi(u_i)$ is a nonzero
codeword of B. Hence, $\mathrm{wt}(\phi(u_i)) \ge d$. As $(u_1, \ldots, u_N)$ has at least D nonzero
positions, the number of nonzero positions of $(\phi(u_1), \ldots, \phi(u_N))$ is at least dD.

By Theorem 6.1.1(iv), we obtain an [nN, mK, dD]-linear code over $\mathbb{F}_q$.
This completes the proof. □

The code A in Theorem 6.3.1 is called the outer code, while the code B in
Theorem 6.3.1 is called the inner code.

In Theorem 6.3.1, let $B = \mathbb{F}_q^m$ be the trivial code with the parameters
[m, m, 1]. We obtain the following result.

Corollary 6.3.2 We have an [mN, mK, D]-linear code over $\mathbb{F}_q$ whenever there
is an [N, K, D]-linear code over $\mathbb{F}_{q^m}$.

Example 6.3.3 (i) We know that there exist a [17, 15, 3]-Hamming code
over $\mathbb{F}_{16}$ and a binary [8, 4, 4]-linear code (see Proposition 5.3.10). By
Theorem 6.3.1, we obtain a binary [136, 60, 12]-linear code.

(ii) We have an $[(8^3 - 1)/(8 - 1), (8^3 - 1)/(8 - 1) - 3, 3] = [73, 70, 3]$-Hamming
code over $\mathbb{F}_8$. Thus, we obtain a binary [219, 210, 3]-linear code by
Corollary 6.3.2.
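The parameter arithmetic in Example 6.3.3 can be reproduced with a few lines of Python. This is only bookkeeping on the triples [n, k, d]; no code is actually constructed, and the function names are hypothetical.

```python
def hamming_parameters(q, r):
    """Parameters [n, n - r, 3] of the q-ary Hamming code Ham(r, q)."""
    n = (q ** r - 1) // (q - 1)
    return n, n - r, 3

def concatenated_parameters(outer, inner, m):
    """Parameters [nN, mK, dD] guaranteed by Theorem 6.3.1.

    outer = (N, K, D) over F_{q^m}; inner = (n, k, d) over F_q with k = m.
    """
    N, K, D = outer
    n, k, d = inner
    if k != m:
        raise ValueError("the inner code must have dimension m")
    return n * N, m * K, d * D

# (i) outer Ham(2, 16), inner binary [8, 4, 4]-code, so q = 2 and m = 4.
print(concatenated_parameters(hamming_parameters(16, 2), (8, 4, 4), m=4))
# (ii) outer Ham(3, 8), trivial inner code F_2^3 with parameters [3, 3, 1].
print(concatenated_parameters(hamming_parameters(8, 3), (3, 3, 1), m=3))
```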

Example 6.3.4 (i) Consider the linear code

$$A := \{(0, 0), (1, \alpha), (\alpha, 1 + \alpha), (1 + \alpha, 1)\}$$

over $\mathbb{F}_4$, where $\alpha$ is a root of $1 + x + x^2$. Let B be the binary code

$$\{000, 110, 101, 011\},$$

and consider the $\mathbb{F}_2$-linear transformation between $\mathbb{F}_4$ and B defined by

$$\phi : 0 \mapsto 000, \quad 1 \mapsto 110, \quad \alpha \mapsto 101, \quad 1 + \alpha \mapsto 011.$$

Then we obtain the code

$$C := \phi^*(A) = \{000000, 110101, 101011, 011110\}.$$

The new code C has parameters [6, 2, 4].

(ii) Let $\alpha$ be a root of $1 + x + x^3 \in \mathbb{F}_2[x]$. Then, $\mathbb{F}_8 = \mathbb{F}_2[\alpha]$. By Exercise
4.4, $\{1, \alpha, \alpha^2\}$ forms a basis of $\mathbb{F}_8$ over $\mathbb{F}_2$. Consider the map $\phi : \mathbb{F}_8 \to \mathbb{F}_2^3$,

$$a_1 \cdot 1 + a_2 \cdot \alpha + a_3 \cdot \alpha^2 \mapsto (a_1, a_2, a_3).$$

Let $A = \langle (\alpha, \alpha + 1, 1) \rangle / \mathbb{F}_8$. By Corollary 6.3.2, $C := \phi^*(A)$ is a binary
[9, 3, d]-linear code, where $d \ge 3$. We list all the elements of A:

$$A = \{(0, 0, 0),\ (\alpha, \alpha + 1, 1),\ (\alpha^2, \alpha^2 + \alpha, \alpha),\ (\alpha + 1, \alpha^2 + \alpha + 1, \alpha^2),\ (\alpha^2 + \alpha, \alpha^2 + 1, \alpha + 1),$$
$$(\alpha^2 + \alpha + 1, 1, \alpha^2 + \alpha),\ (\alpha^2 + 1, \alpha, \alpha^2 + \alpha + 1),\ (1, \alpha^2, \alpha^2 + 1)\}.$$

Therefore,

$$C = \phi^*(A) = \{000000000, 010110100, 001011010, 110111001, 011101110, 111100011, 101010111, 100001101\}.$$

Thus, C is in fact a binary [9, 3, 4]-linear code.
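Part (ii) of Example 6.3.4 can be recomputed with a small model of $\mathbb{F}_8$. In the sketch below, which is an illustration rather than a general-purpose implementation, an element $a_1 + a_2\alpha + a_3\alpha^2$ is stored as an integer whose bit i is the coefficient of $\alpha^i$; the names `gf8_mul` and `phi` are assumptions of this example.

```python
def gf8_mul(x, y):
    """Multiply in F_8 = F_2[a]/(1 + a + a^3); bit i is the coefficient of a^i."""
    result = 0
    for i in range(3):
        if (y >> i) & 1:
            result ^= x << i
    for i in (4, 3):                     # reduce degrees 4 and 3 via a^3 = a + 1
        if (result >> i) & 1:
            result ^= 0b1011 << (i - 3)
    return result

def phi(e):
    """Coordinates (a1, a2, a3) of e with respect to the basis {1, a, a^2}."""
    return "".join(str((e >> i) & 1) for i in range(3))

generator = (0b010, 0b011, 0b001)        # (alpha, alpha + 1, 1)
A = [tuple(gf8_mul(lam, g) for g in generator) for lam in range(8)]
C = {"".join(phi(e) for e in word) for word in A}

d = min(w.count("1") for w in C if "1" in w)
print(sorted(C))
print(len(C), d)
```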

Any vector space V over $\mathbb{F}_{q^m}$ can be viewed as a vector space over $\mathbb{F}_q$. In
particular, $\mathbb{F}_{q^m}^N$ is a vector space over $\mathbb{F}_q$ of dimension mN. This view brings
out another construction.

Theorem 6.3.5 (Subfield subcode.) Let C be an [N, K, D]-linear code over
$\mathbb{F}_{q^m}$. Then the subfield subcode $C|_{\mathbb{F}_q} := C \cap \mathbb{F}_q^N$ is an [n, k, d]-linear code
over $\mathbb{F}_q$ with n = N, $k \ge mK - (m - 1)N$ and $d \ge D$. Moreover, an
[N, mK - (m - 1)N, D]-linear code over $\mathbb{F}_q$ can be obtained provided that
mK > (m - 1)N.

Proof. It is clear that $C|_{\mathbb{F}_q}$ is a linear code over $\mathbb{F}_q$ as both C and $\mathbb{F}_q^N$ can be
viewed as $\mathbb{F}_q$-subspaces of $\mathbb{F}_{q^m}^N$.

The length of $C|_{\mathbb{F}_q}$ is clear. For the dimension, we have

$$\dim_{\mathbb{F}_q} C|_{\mathbb{F}_q} = \dim_{\mathbb{F}_q}(C \cap \mathbb{F}_q^N)$$
$$= \dim_{\mathbb{F}_q} C + \dim_{\mathbb{F}_q} \mathbb{F}_q^N - \dim_{\mathbb{F}_q}(C + \mathbb{F}_q^N)$$
$$\ge \log_q |C| + N - \dim_{\mathbb{F}_q}(\mathbb{F}_{q^m}^N) \quad \text{(as } C + \mathbb{F}_q^N \text{ is an } \mathbb{F}_q\text{-subspace of } \mathbb{F}_{q^m}^N\text{)}$$
$$= \log_q (q^m)^K + N - \log_q q^{mN}$$
$$= mK + N - mN = mK - (m - 1)N.$$

As $C|_{\mathbb{F}_q}$ is a subset of C, it is clear that the minimum Hamming weight of $C|_{\mathbb{F}_q}$
is at least the minimum Hamming weight of C; i.e., $d(C|_{\mathbb{F}_q}) \ge d(C) = D$.
Applying Corollary 6.1.3 gives the desired result on the second part. □

Example 6.3.6 Let $\alpha$ be a root of $1 + x + x^2 \in \mathbb{F}_2[x]$. Then $\mathbb{F}_4 = \mathbb{F}_2[\alpha]$. Let

$$C = \langle \{(\alpha, 0, 0), (0, \alpha + 1, 0)\} \rangle / \mathbb{F}_4.$$

Thus, by Theorem 6.3.5, $C|_{\mathbb{F}_2}$ is a binary [3, k, d]-linear code with

$$k \ge mK - (m - 1)N = 2 \cdot 2 - (2 - 1) \cdot 3 = 1, \quad d \ge d(C) = 1.$$

We list all the elements of C:

$$C = \{(0, 0, 0),\ (\alpha, 0, 0),\ (1, 0, 0),\ (\alpha + 1, 0, 0),\ (0, \alpha + 1, 0),\ (0, \alpha, 0),\ (0, 1, 0),\ (\alpha, \alpha + 1, 0),$$
$$(\alpha, \alpha, 0),\ (\alpha, 1, 0),\ (1, \alpha + 1, 0),\ (1, \alpha, 0),\ (1, 1, 0),\ (\alpha + 1, \alpha + 1, 0),\ (\alpha + 1, \alpha, 0),\ (\alpha + 1, 1, 0)\}.$$

It is clear that $C|_{\mathbb{F}_2} = C \cap \mathbb{F}_2^3 = \{000, 100, 010, 110\}$. Hence, $C|_{\mathbb{F}_2}$ is in fact a
binary [3, 2, 1]-linear code.
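The subfield subcode of Example 6.3.6 can be extracted by brute force. The following sketch assumes a small integer model of $\mathbb{F}_4$ in which 0, 1, 2, 3 stand for 0, 1, $\alpha$, $\alpha + 1$ and addition is bitwise XOR; the name `gf4_mul` is hypothetical.

```python
def gf4_mul(x, y):
    """Multiply in F_4 = F_2[a]/(1 + a + a^2); 0, 1, 2, 3 mean 0, 1, a, a + 1."""
    result = 0
    for i in range(2):
        if (y >> i) & 1:
            result ^= x << i
    if result & 0b100:                   # reduce via a^2 = a + 1
        result ^= 0b111
    return result

g1 = (2, 0, 0)                           # (alpha, 0, 0)
g2 = (0, 3, 0)                           # (0, alpha + 1, 0)

# All F_4-linear combinations x*g1 + y*g2.
C = {
    tuple(gf4_mul(x, a) ^ gf4_mul(y, b) for a, b in zip(g1, g2))
    for x in range(4) for y in range(4)
}

# Keep the codewords whose coordinates all lie in F_2 = {0, 1}.
subfield_subcode = {c for c in C if all(e in (0, 1) for e in c)}
print(len(C), sorted(subfield_subcode))
```

The subcode has 4 elements, so its dimension is 2, which exceeds the lower bound k ≥ 1 given by Theorem 6.3.5.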

For the final construction in this section, we need the results of Exercise 4.5. 

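Theorem 6.3.7 (Trace code.) Let C be an [N, K, D]-linear code over $\mathbb{F}_{q^m}$.
Then the trace code of C defined by

$$\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(C) := \{(\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(c_1), \ldots, \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(c_n)) : (c_1, \ldots, c_n) \in C\}$$

is an [n, k]-linear code over $\mathbb{F}_q$ with n = N and $k \le mK$.

Proof. Since $\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}$ is an $\mathbb{F}_q$-linear transformation from $\mathbb{F}_{q^m}$ to $\mathbb{F}_q$, the set
$\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(C)$ is a subspace of $\mathbb{F}_q^n$. It is clear that the length of $\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(C)$ is n.
For the dimension, we have

$$\dim_{\mathbb{F}_q} \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(C) = \log_q |\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(C)| \le \log_q |C| = \log_q (q^m)^{\dim_{\mathbb{F}_{q^m}} C} = \log_q (q^{mK}) = mK.$$

□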

Example 6.3.8 Consider the code $C = \{\lambda(1, \alpha, \alpha + 1) : \lambda \in \mathbb{F}_9\}$ over $\mathbb{F}_9$,
where $\alpha$ is a root of $2 + x + x^2 \in \mathbb{F}_3[x]$. Then

$$C = \{(0, 0, 0),\ (\alpha, 1 + 2\alpha, 1),\ (1 + 2\alpha, 2 + 2\alpha, \alpha),\ (2 + 2\alpha, 2, 1 + 2\alpha),\ (2, 2\alpha, 2 + 2\alpha),$$
$$(2\alpha, 2 + \alpha, 2),\ (2 + \alpha, 1 + \alpha, 2\alpha),\ (1 + \alpha, 1, 2 + \alpha),\ (1, \alpha, 1 + \alpha)\}.$$

Under the trace map $\mathrm{Tr}_{\mathbb{F}_9/\mathbb{F}_3}$, we have

$(0, 0, 0) \mapsto (0, 0, 0)$, $(\alpha, 1 + 2\alpha, 1) \mapsto (2, 0, 2)$,

$(1 + 2\alpha, 2 + 2\alpha, \alpha) \mapsto (0, 2, 2)$, $(2 + 2\alpha, 2, 1 + 2\alpha) \mapsto (2, 1, 0)$,

$(2, 2\alpha, 2 + 2\alpha) \mapsto (1, 1, 2)$, $(2\alpha, 2 + \alpha, 2) \mapsto (1, 0, 1)$,

$(2 + \alpha, 1 + \alpha, 2\alpha) \mapsto (0, 1, 1)$, $(1 + \alpha, 1, 2 + \alpha) \mapsto (1, 2, 0)$,

$(1, \alpha, 1 + \alpha) \mapsto (2, 2, 1)$.

Hence, the trace code

$$\mathrm{Tr}_{\mathbb{F}_9/\mathbb{F}_3}(C) = \{(0, 0, 0), (2, 0, 2), (0, 2, 2), (2, 1, 0), (1, 1, 2), (1, 0, 1), (0, 1, 1), (1, 2, 0), (2, 2, 1)\}$$

is a [3, 2, 2]-linear code over $\mathbb{F}_3$.
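The trace computations of Example 6.3.8 can be repeated by machine. The sketch below assumes a model of $\mathbb{F}_9$ in which $a + b\alpha$ is stored as the pair (a, b), with $\alpha^2 = 1 + 2\alpha$ (from $2 + \alpha + \alpha^2 = 0$), and it uses the formula $\mathrm{Tr}(x) = x + x^3$ for the trace from $\mathbb{F}_9$ to $\mathbb{F}_3$; the function names are hypothetical.

```python
from itertools import product

def f9_add(x, y):
    """Add two elements a + b*alpha of F_9, stored as pairs (a, b)."""
    return ((x[0] + y[0]) % 3, (x[1] + y[1]) % 3)

def f9_mul(x, y):
    """(a + b*alpha)(c + d*alpha), using alpha^2 = 1 + 2*alpha."""
    a, b = x
    c, d = y
    return ((a * c + b * d) % 3, (a * d + b * c + 2 * b * d) % 3)

def trace(x):
    """Tr(x) = x + x^3; the result lies in F_3 and is returned as an integer."""
    t = f9_add(x, f9_mul(x, f9_mul(x, x)))
    assert t[1] == 0
    return t[0]

F9 = list(product(range(3), repeat=2))
g = ((1, 0), (0, 1), (1, 1))             # (1, alpha, 1 + alpha)
C = [tuple(f9_mul(lam, gi) for gi in g) for lam in F9]
trace_code = {tuple(trace(c) for c in word) for word in C}

d = min(sum(1 for e in w if e) for w in trace_code if any(w))
print(sorted(trace_code))
print(len(trace_code), d)
```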

In fact, trace codes are none other than subfield subcodes. This is shown by 
the following result. 

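Theorem 6.3.9 (Delsarte.) For a linear code C over $\mathbb{F}_{q^m}$, one has

$$(C|_{\mathbb{F}_q})^\perp = \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(C^\perp).$$

Proof. In order to prove that $(C|_{\mathbb{F}_q})^\perp \supseteq \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(C^\perp)$, we have to show that
$c \cdot \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(a) = 0$ for all $a \in C^\perp$ and $c \in C|_{\mathbb{F}_q}$.

Write $c = (c_1, \ldots, c_n)$ and $a = (a_1, \ldots, a_n)$; then

$$c \cdot \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(a) = \sum_{i=1}^{n} c_i \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(a_i) = \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}\left(\sum_{i=1}^{n} c_i a_i\right) = \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(c \cdot a) = 0.$$

We have used here the $\mathbb{F}_q$-linearity of the trace and the fact that $c \cdot a = 0$.
Next, we show that $(C|_{\mathbb{F}_q})^\perp \subseteq \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(C^\perp)$. This assertion is equivalent to

$$(\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(C^\perp))^\perp \subseteq C|_{\mathbb{F}_q}.$$

Suppose the above relationship does not hold; then there exist some

$$u \in (\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(C^\perp))^\perp \setminus C|_{\mathbb{F}_q}$$

and $v \in C^\perp$ with $u \cdot v \ne 0$. As $\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}$ is not the zero-map (see Exercise 4.5),
there is an element $\gamma \in \mathbb{F}_{q^m}$ such that $\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(\gamma(u \cdot v)) \ne 0$. Hence,

$$u \cdot \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(\gamma v) = \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(u \cdot \gamma v) = \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(\gamma(u \cdot v)) \ne 0.$$

But, on the other hand, we have $u \cdot \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(\gamma v) = 0$ because
$u \in (\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(C^\perp))^\perp$ and $\gamma v \in C^\perp$. The desired result follows from this
contradiction. □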

The above theorem shows that trace codes can be obtained from subfield 
subcodes. 

Example 6.3.10 As in Example 6.3.8, consider the code $C = \{\lambda(1, \alpha, \alpha + 1) : \lambda \in \mathbb{F}_9\}$
over $\mathbb{F}_9$, where $\alpha$ is a root of $2 + x + x^2 \in \mathbb{F}_3[x]$. Then, by Theorem
6.3.9 and Example 6.3.8, we have

$$C^\perp|_{\mathbb{F}_3} = (\mathrm{Tr}_{\mathbb{F}_9/\mathbb{F}_3}(C))^\perp$$
$$= \{(0, 0, 0), (2, 0, 2), (0, 2, 2), (2, 1, 0), (1, 1, 2), (1, 0, 1), (0, 1, 1), (1, 2, 0), (2, 2, 1)\}^\perp$$
$$= \{(0, 0, 0), (1, 1, 2), (2, 2, 1)\}.$$
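Delsarte's theorem can be checked numerically on this example by computing both sides independently. The sketch below is an illustration under the same assumed model of $\mathbb{F}_9$ as before (pairs (a, b) for $a + b\alpha$, with $\alpha^2 = 1 + 2\alpha$ and $\mathrm{Tr}(x) = x + x^3$); it uses the fact that a vector lies in $C^\perp$ exactly when it is orthogonal to the generator $(1, \alpha, 1 + \alpha)$. All function names are hypothetical.

```python
from itertools import product

def f9_add(x, y):
    return ((x[0] + y[0]) % 3, (x[1] + y[1]) % 3)

def f9_mul(x, y):
    """(a + b*alpha)(c + d*alpha), using alpha^2 = 1 + 2*alpha."""
    a, b = x
    c, d = y
    return ((a * c + b * d) % 3, (a * d + b * c + 2 * b * d) % 3)

def trace(x):
    """Tr(x) = x + x^3, returned as an element of F_3."""
    return f9_add(x, f9_mul(x, f9_mul(x, x)))[0]

F9 = list(product(range(3), repeat=2))
F3_cubed = list(product(range(3), repeat=3))
g = ((1, 0), (0, 1), (1, 1))             # generator (1, alpha, 1 + alpha) of C

# Right-hand side: the dual (over F_3) of the trace code of C.
trace_code = {tuple(trace(f9_mul(lam, gi)) for gi in g) for lam in F9}
dual_of_trace = {
    v for v in F3_cubed
    if all(sum(a * b for a, b in zip(v, w)) % 3 == 0 for w in trace_code)
}

# Left-hand side: vectors of F_3^3 that are orthogonal to g over F_9.
def orthogonal_to_g(v):
    total = (0, 0)
    for vi, gi in zip(v, g):
        total = f9_add(total, f9_mul((vi, 0), gi))
    return total == (0, 0)

subfield_subcode = {v for v in F3_cubed if orthogonal_to_g(v)}

print(sorted(subfield_subcode), subfield_subcode == dual_of_trace)
```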


Exercises 

6.1 (a) Given an [n, k, d]-linear code over $\mathbb{F}_q$, can one always construct an
[n + 1, k + 1, d]-linear code? Justify your answer.

(b) Given an [n, k, d]-linear code over $\mathbb{F}_q$, can one always construct an
[n + 1, k, d + 1]-linear code? Justify your answer.

6.2 Let C be a q-ary [n, k, d]-linear code. For a fixed $1 \le i \le n$, form the
subset A of C consisting of the codewords with the ith position equal
to 0. Delete the ith position from all the words in A to form a code D.
Show that D is a q-ary [n - 1, k', d']-linear code with

$$k - 1 \le k' \le k, \quad d' \ge d.$$

(Note: this way of obtaining a new code is called shortening.)

6.3 Suppose that

$$G = \begin{pmatrix} 1 & 1 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \end{pmatrix}$$

is a generator matrix of a binary code C. Find a generator matrix of A
with respect to i = 2 using the construction in Exercise 6.2.

6.4 Let $H_i$ be a parity-check matrix of $C_i$, for i = 1, 2.

(a) Find a parity-check matrix of $C_1 \oplus C_2$ and justify your answer.

(b) Find a parity-check matrix of the code obtained from the (u, u + v)-
construction and justify your answer.

6.5 (i) Let A = {0000, 1100, 0011, 1111} be a binary code. Find the code 

C constructed from A using Corollary 6.1.11. 

(ii) Let H be a parity-check matrix of A in (i). Find a parity-check matrix 
of C constructed from A using Corollary 6.1.11. 

6.6 Assume that q is odd. Let $C_i$ be an $[n, k_i, d_i]$-linear code over $\mathbb{F}_q$, for
i = 1, 2. Define

$$C_1 \between C_2 := \{(c_1 + c_2, c_1 - c_2) : c_1 \in C_1, c_2 \in C_2\}.$$

(a) Show that $C_1 \between C_2$ is a $[2n, k_1 + k_2]$-linear code over $\mathbb{F}_q$.

(b) If $G_i$ is a generator matrix of $C_i$, for i = 1, 2, find a generator matrix
for $C_1 \between C_2$ in terms of $G_1$ and $G_2$.

(c) Let d be the distance of $C_1 \between C_2$. Show that $d = 2d_2$ if $2d_2 \le d_1$ and
$d_1 \le d \le 2d_2$ if $2d_2 > d_1$.

6.7 Let $C_i$ be an $[n, k_i, d_i]$-linear code over $\mathbb{F}_q$, for i = 1, 2. Define

$$C := \{(a + x, b + x, a + b + x) : a, b \in C_1, x \in C_2\}.$$

(a) Show that C is a $[3n, 2k_1 + k_2]$-linear code over $\mathbb{F}_q$.

(b) If $G_i$ is a generator matrix of $C_i$, for i = 1, 2, find a generator matrix
of C in terms of $G_1$ and $G_2$.

(c) If $H_i$ is a parity-check matrix of $C_i$, for i = 1, 2, find a parity-check
matrix of C in terms of $H_1$ and $H_2$.

6.8 (a) Find the smallest n such that there exists a binary [n, 50, 3]-linear 

code. 

(b) Find the smallest n such that there exists a binary [n, 60, 4]-linear 
code. 

6.9 Find the smallest $n$ such that there exists an $[n, 40, 3]$-linear code
over $\mathbf{F}_9$.

6.10 (i) Write down the codewords in $\mathcal{R}(1, m)$ for $m = 3, 4, 5$.

(ii) Verify that $\mathcal{R}(1, 3)$ is self-dual.

6.11 Show that $\mathcal{R}(r, m)$ has parameters $[2^m, \binom{m}{0} + \binom{m}{1} + \cdots + \binom{m}{r}, 2^{m-r}]$.

6.12 For $0 \le r < m$, show that $\mathcal{R}(r, m)^{\perp} = \mathcal{R}(m - 1 - r, m)$.

6.13 Write the binary solutions of the equation

$$x_1 + \cdots + x_m = 1$$

as column vectors of $\mathbf{F}_2^m$. Let $\mathbf{v}_1, \ldots, \mathbf{v}_n$ be all the solutions of the above
equation. Let $C_m$ be the binary linear code with

$$G = (\mathbf{v}_1, \ldots, \mathbf{v}_n)$$

as a generator matrix.

(i) Determine all the codewords of $C_m$, for $m = 2, 3, 4$.

(ii) Find the parameters of $C_m$ for all $m$.

6.14 For a linear code $V$ over $\mathbf{F}_q$, the parameters of $V$ are denoted by

length($V$), dim($V$) and $d(V)$ := minimum distance.

Suppose we have

(1) a code $C$ with length($C$) $= m$ and dim($C$) $= k$, and

(2) a collection of $k$ codes $W_1, \ldots, W_k$, all of them having the same
length $n$.

The elements of $C$ are written as row vectors, and the elements of $W_j$
are written as column vectors. We fix a basis $\{\mathbf{c}^{(1)}, \ldots, \mathbf{c}^{(k)}\}$ of $C$ and
denote by $G$ the $k \times m$ matrix whose rows are $\mathbf{c}^{(1)}, \ldots, \mathbf{c}^{(k)}$. Thus, $G$ is
a generator matrix of $C$. For $1 \le j \le k$, we set

$$C_j := \langle \{\mathbf{c}^{(1)}, \ldots, \mathbf{c}^{(j)}\} \rangle \subseteq \mathbf{F}_q^m.$$

Then $C_j$ is a $q$-ary code of length $m$ and dimension $j$. Moreover,

$$C_1 \subset C_2 \subset \cdots \subset C_k = C.$$

Let $M$ be the set consisting of all the $n \times k$ matrices whose $j$th column
belongs to $W_j$, for all $1 \le j \le k$.

(i) Show that $M$ is an $\mathbf{F}_q$-linear space of dimension $\sum_{j=1}^{k} \dim(W_j)$.

(ii) If we identify an $n \times m$ matrix $A$ with a vector $\mathbf{a}$ of $\mathbf{F}_q^{mn}$ by putting the
$i$th row of $A$ in the $i$th block of $m$ positions of $\mathbf{a}$, then the $q$-ary linear
code

$$W := \{AG : A \in M\}$$

has parameters

$$\mathrm{length}(W) = mn,$$
$$\dim(W) = \sum_{j=1}^{k} \dim(W_j),$$
$$d(W) \ge \min\{d(W_j) \cdot d(C_j) : 1 \le j \le k\}.$$

(iii) By using the binary codes with parameters [2, 1, 2], [20, 19, 2] and
[20, 14, 4], show that we can produce a binary [40, 33, 4]-linear code.

6.15 (i) Show that there always exists an $[n, n - 1, 2]$-linear code over $\mathbf{F}_q$ for
any $n \ge 2$.

(ii) Prove that there is an $[nN, (n - 1)K, 2D]$-linear code over $\mathbf{F}_q$ whenever
there is an $[N, K, D]$-linear code over $\mathbf{F}_{q^{n-1}}$.

6.16 Let $\alpha$ be a root of $1 + x^2 + x^3 \in \mathbf{F}_2[x]$. Consider the map

$$\phi : \mathbf{F}_8 \to \mathbf{F}_2^3, \quad a_1 \cdot 1 + a_2 \cdot \alpha + a_3 \cdot \alpha^2 \mapsto (a_1, a_2, a_3).$$

Let $A = \langle (\alpha + 1, \alpha^2 + 1, 1) \rangle / \mathbf{F}_8$. Determine all the codewords of
$\phi^*(A) = \{(\phi(c_1), \phi(c_2), \phi(c_3)) : (c_1, c_2, c_3) \in A\}$.

6.17 Consider the linear code

$$A := \langle \{(1, 1), (\alpha, 1 + \alpha)\} \rangle$$

over $\mathbf{F}_4$, where $\alpha$ is a root of $1 + x + x^2 \in \mathbf{F}_2[x]$. Let $B$ be the binary
code {0000, 1100, 1010, 0110} and consider the $\mathbf{F}_2$-linear transformation
between $\mathbf{F}_4$ and $B$ defined by

$$\phi : 0 \mapsto 0000, \quad 1 \mapsto 1100, \quad \alpha \mapsto 1010, \quad 1 + \alpha \mapsto 0110.$$

Determine all the codewords of the code

$$C := \phi^*(A) = \{(\phi(c_1), \phi(c_2)) : (c_1, c_2) \in A\}.$$

6.18 Let Ham($m$, 4) be a Hamming code of length $(4^m - 1)/3$ over $\mathbf{F}_4$. Using
Theorem 6.3.5, estimate the parameters of Ham$(m, 4)|_{\mathbf{F}_2}$. Find the exact
parameters of Ham$(3, 4)|_{\mathbf{F}_2}$.

6.19 Let $C = \langle \{(1, \alpha, \alpha^2), (\alpha^2, \alpha, 0)\} \rangle$ be a linear code over $\mathbf{F}_4$, where $\alpha$ is a
root of $1 + x + x^2 \in \mathbf{F}_2[x]$. Determine all the codewords of $C|_{\mathbf{F}_2}$.

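6.20 (i) Suppose that $\mathbf{u}_1, \ldots, \mathbf{u}_r$ are vectors of $\mathbf{F}_q^n$. Show that the set
$\{\mathbf{u}_1, \ldots, \mathbf{u}_r\}$ is $\mathbf{F}_q$-linearly independent if and only if it is $\mathbf{F}_{q^m}$-linearly
independent for all $m \ge 1$.

(ii) Show that, for an $[N, K]$-linear code $C$ over $\mathbf{F}_{q^m}$, the subfield subcode
$C|_{\mathbf{F}_q}$ has dimension at most $K$. Moreover, show that $\dim_{\mathbf{F}_q}(C|_{\mathbf{F}_q}) = K$
if and only if there is an $\mathbf{F}_{q^m}$-basis $\{\mathbf{c}_1, \ldots, \mathbf{c}_K\}$ of $C$ such that
$\mathbf{c}_i \in \mathbf{F}_q^N$ for all $i = 1, \ldots, K$.

6.21 Show that

$$\dim_{\mathbf{F}_q}\left(\mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}(C)\right) \ge \dim_{\mathbf{F}_{q^m}}(C)$$

for any linear code $C$ over $\mathbf{F}_{q^m}$.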

6.22 (i) Show that, for a polynomial $f(x) \in \mathbf{F}_q[x]$ and an element $\alpha \in \mathbf{F}_{q^2}$,
one has

$$(f(\alpha))^q + f(\alpha) \in \mathbf{F}_q \quad \text{and} \quad (f(\alpha))^{q+1} \in \mathbf{F}_q.$$

(ii) Show that the set

$$S_m = \{x^{i(q+1)}(x^{jq} + x^j) : i(q + 1) + jq \le (q + 1)m\}$$

has $(m + 1)(m + 2)/2$ elements. Moreover, show that the vector
space $V_m = \langle S_m \rangle$ spanned by $S_m$ over $\mathbf{F}_q$ has dimension $|S_m| =
(m + 1)(m + 2)/2$.

(iii) For an element $\beta \in \mathbf{F}_{q^2} \setminus \mathbf{F}_q$, we have $\beta^q \ne \beta$. Thus, we can label
all the elements of $\mathbf{F}_{q^2} \setminus \mathbf{F}_q$ as follows:

$$\mathbf{F}_{q^2} \setminus \mathbf{F}_q = \{\beta_1, \beta_1^q, \ldots, \beta_n, \beta_n^q\},$$

where $n = (q^2 - q)/2$. Show that, for $m < q - 1$, the code

$$C_m = \{(g(\beta_1), \ldots, g(\beta_n)) : g \in V_m\}$$

is an $[n, (m + 1)(m + 2)/2, d]$-linear code over $\mathbf{F}_q$ with $d \ge n -
m(q + 1)/2$.

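6.23 Let $A$ be an $[N, K, D]$-linear code over $\mathbf{F}_{q^m}$ and let $B$ be an $[n, m, d]$-linear
code over $\mathbf{F}_q$. We set up an $\mathbf{F}_q$-linear transformation $\phi$ between
$\mathbf{F}_{q^m}$ and $B$ such that $\phi$ is bijective. We extend the map $\phi$ and obtain a
map

$$\phi^* : \mathbf{F}_{q^m}^N \to \mathbf{F}_q^{nN}, \quad (v_1, \ldots, v_N) \mapsto (\phi(v_1), \ldots, \phi(v_N)).$$

Show that the code

$$(B^{\perp})^N := \{(\mathbf{c}_1, \ldots, \mathbf{c}_N) : \mathbf{c}_i \in B^{\perp}\}$$

is contained in $(\phi^*(A))^{\perp}$.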

6.24 Let $C$ be a $q^m$-ary linear code. Show that

$$\dim_{\mathbf{F}_{q^m}}(C) - (m - 1)(n - \dim_{\mathbf{F}_{q^m}}(C)) \le \dim_{\mathbf{F}_q}(C|_{\mathbf{F}_q}) \le \dim_{\mathbf{F}_{q^m}}(C)$$

and

$$\dim_{\mathbf{F}_{q^m}}(C) \le \dim_{\mathbf{F}_q}\left(\mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}(C)\right) \le m \cdot \dim_{\mathbf{F}_{q^m}}(C).$$

6.25 Let $C$ be a $q^m$-ary linear code of length $n$ and let $U$ be an $\mathbf{F}_{q^m}$-subspace
of $C$ with the additional property $U^q \subseteq C$, where

$$U^q = \{(u_1^q, \ldots, u_n^q) : (u_1, \ldots, u_n) \in U\}.$$

Show that

$$\dim_{\mathbf{F}_q}\left(\mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}(C)\right) \le m\left(\dim_{\mathbf{F}_{q^m}}(C) - \dim_{\mathbf{F}_{q^m}}(U)\right) + \dim_{\mathbf{F}_q}(U|_{\mathbf{F}_q}).$$

6.26 Let $C$ be a $q^m$-ary linear code of length $n$ and let $V$ be an $\mathbf{F}_{q^m}$-subspace
of $C^{\perp}$ with the additional property $V^q \subseteq C^{\perp}$. Show that

$$\dim_{\mathbf{F}_q}(C|_{\mathbf{F}_q}) \ge \dim_{\mathbf{F}_{q^m}}(C) - (m - 1)\left(n - \dim_{\mathbf{F}_{q^m}}(C) - \dim_{\mathbf{F}_{q^m}}(V)\right).$$

6.27 Let $C$ be a $q^m$-ary linear code of length $n$. Show that the following three
conditions are equivalent:

(i) $C^q = C$;

(ii) $\dim_{\mathbf{F}_q}(C|_{\mathbf{F}_q}) = \dim_{\mathbf{F}_{q^m}}(C)$;

(iii) $\mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}(C) = C|_{\mathbf{F}_q}$.

6.28 (Alphabet extension.) Let $s$ and $r$ be two integers such that $s > r > 1$.
We embed an alphabet $A$ of cardinality $r$ into an alphabet $B$ of cardinality
$s$. For an $(n, M, d)$-code $C$ over $A$, consider an embedding

$$C \hookrightarrow A^n \hookrightarrow B^n.$$

The code $C$ can be viewed as a subset of $B^n$ and therefore a code over
$B$. Show that the code $C$ still has parameters $(n, M, d)$ when viewed as
a code over $B$.

6.29 Let $r$ and $s$ be two integers bigger than 1. Let $C_1$ be an $(n, M_1, d_1)$-code
over $\mathbf{Z}_r$, and let $C_2$ be an $(n, M_2, d_2)$-code over $\mathbf{Z}_s$. We embed $\mathbf{Z}_r$ ($\mathbf{Z}_s$,
respectively) into $\mathbf{Z}_{rs}$ by mapping $(i \pmod r) \in \mathbf{Z}_r$ ($(i \pmod s) \in \mathbf{Z}_s$,
respectively) to $(i \pmod{rs}) \in \mathbf{Z}_{rs}$. Then both $C_1$ and $C_2$ can be viewed
as codes over $\mathbf{Z}_{rs}$. Show that the code

$$C_1 + rC_2 := \{\mathbf{a} + r\mathbf{b} \in \mathbf{Z}_{rs}^n : \mathbf{a} \in C_1, \mathbf{b} \in C_2\}$$

is an $(n, M_1 M_2, \min\{d_1, d_2\})$-code over $\mathbf{Z}_{rs}$.

6.30 (Alphabet restriction.) Let $s$ and $r$ be two integers such that $s > r > 1$.
We embed an alphabet $A$ of cardinality $r$ into $\mathbf{Z}_s$. For an $(n, M, d)$-code
$C$ over $\mathbf{Z}_s$, consider all the $s^n$ shifts

$$C_{\mathbf{v}} := \{\mathbf{v} + \mathbf{c} : \mathbf{c} \in C\}$$

for all $\mathbf{v} \in \mathbf{Z}_s^n$. Show that there exists a vector $\mathbf{v}_0 \in \mathbf{Z}_s^n$ such that the
intersection $C_{\mathbf{v}_0} \cap A^n$ is an $r$-ary $(n, M', d')$-code with $M' \ge M(r/s)^n$
and $d' \ge d$.
In the previous chapters, we concentrated mostly on linear codes because they
have algebraic structures. These structures simplify the study of linear codes. 
For example, a linear code can be described by its generator or parity-check 
matrices; the minimum distance is determined by the Hamming weight, etc. 
However, we have to introduce more structures besides linearity in order for 
codes to be implemented easily. For the sake of easy encoding and decoding, one 
naturally requires a cyclic shift of a codeword in a code C to be still a codeword 
of C. This requirement looks like a combinatorial structure. Fortunately, this 
structure can be converted into an algebraic one. Moreover, we will see that 
a cyclic code of length n is totally determined by a polynomial of degree less 
than n. 

Cyclic codes were first studied by Prange [17] in 1957. Since then, algebraic 
coding theorists have made great progress in the study of cyclic codes for both 
random-error correction and burst-error correction. Many important classes of 
codes are among cyclic codes, such as the Hamming codes, Golay codes and 
the codes in Chapters 8 and 9. 

We first define cyclic codes in this chapter, and then discuss their algebraic 
structure and other properties. In the final two sections, a decoding algorithm 
and burst-error-correcting codes are studied. 


7.1 Definitions 

Definition 7.1.1 A subset $S$ of $\mathbf{F}_q^n$ is cyclic if $(a_{n-1}, a_0, a_1, \ldots, a_{n-2}) \in S$
whenever $(a_0, a_1, \ldots, a_{n-1}) \in S$. A linear code $C$ is called a cyclic code if $C$
is a cyclic set.

The word $(u_{n-r}, \ldots, u_{n-1}, u_0, u_1, \ldots, u_{n-r-1})$ is said to be obtained from
the word $(u_0, \ldots, u_{n-1}) \in \mathbf{F}_q^n$ by cyclically shifting $r$ positions.

It is easy to verify that the dual code of a cyclic code is also a cyclic code
(see Exercise 7.2).

Example 7.1.2 The sets

$$\{(0, 1, 1, 2), (2, 0, 1, 1), (1, 2, 0, 1), (1, 1, 2, 0)\} \subset \mathbf{F}_3^4, \qquad \{11111\} \subset \mathbf{F}_2^5$$

are cyclic sets, but they are not cyclic codes since they are not linear spaces.

Example 7.1.3 The following codes are cyclic codes:

(i) three trivial codes $\{\mathbf{0}\}$, $\{\lambda \cdot \mathbf{1} : \lambda \in \mathbf{F}_q\}$ and $\mathbf{F}_q^n$;

(ii) the binary [3, 2, 2]-linear code {000, 110, 101, 011};

(iii) the simplex code $S(3, 2)$ = {0000000, 1011100, 0101110, 0010111,
1110010, 0111001, 1001011, 1100101}.
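
The two conditions in Definition 7.1.1 (closure under one cyclic shift, and linearity) can be tested by brute force on small examples. The following is a minimal sketch, assuming words are stored as tuples over a prime field $\mathbf{F}_q$ represented by the integers modulo $q$; the function names `is_cyclic_set` and `is_linear` are illustrative and not part of the text.

```python
def is_cyclic_set(words):
    """True if every one-step cyclic shift of a word stays in the set."""
    s = set(words)
    return all((w[-1],) + w[:-1] in s for w in s)

def is_linear(words, q):
    """True if the set is closed under F_q-linear combinations (q prime)."""
    s = set(words)
    if not s:
        return False
    return all(
        tuple((a * u[i] + b * v[i]) % q for i in range(len(u))) in s
        for u in s for v in s for a in range(q) for b in range(q)
    )

# Example 7.1.2: cyclic sets that are not linear spaces
S1 = {(0, 1, 1, 2), (2, 0, 1, 1), (1, 2, 0, 1), (1, 1, 2, 0)}
S2 = {(1, 1, 1, 1, 1)}
# Example 7.1.3(ii): the binary [3, 2, 2]-linear code
C = {(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)}

print(is_cyclic_set(S1), is_linear(S1, 3))  # True False
print(is_cyclic_set(S2), is_linear(S2, 2))  # True False
print(is_cyclic_set(C), is_linear(C, 2))    # True True
```

Checking a single shift suffices, since shifting by $r$ positions is the one-step shift applied $r$ times.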

In order to convert the combinatorial structure of cyclic codes into an algebraic
one, we consider the following correspondence:

$$\pi : \mathbf{F}_q^n \to \mathbf{F}_q[x]/(x^n - 1), \quad (a_0, a_1, \ldots, a_{n-1}) \mapsto a_0 + a_1 x + \cdots + a_{n-1}x^{n-1}. \tag{7.1}$$

Then $\pi$ is an $\mathbf{F}_q$-linear transformation of vector spaces over $\mathbf{F}_q$. From now
on, we will sometimes identify $\mathbf{F}_q^n$ with $\mathbf{F}_q[x]/(x^n - 1)$, and a vector $\mathbf{u} =
(u_0, u_1, \ldots, u_{n-1})$ with the polynomial $u(x) = \sum_{i=0}^{n-1} u_i x^i$. From Theorem
3.2.6, we know that $\mathbf{F}_q[x]/(x^n - 1)$ is a ring (but not a field unless $n = 1$).
Thus, we have a multiplicative operation besides the addition in $\mathbf{F}_q^n$.

Example 7.1.4 Consider the cyclic code $C$ = {000, 110, 101, 011}; then
$\pi(C) = \{0, 1 + x, 1 + x^2, x + x^2\} \subset \mathbf{F}_2[x]/(x^3 - 1)$.

Now we introduce an important notion in the study of cyclic codes. 

Definition 7.1.5 Let $R$ be a ring. A nonempty subset $I$ of $R$ is called an ideal
if

(i) both $a + b$ and $a - b$ belong to $I$, for all $a, b \in I$;

(ii) $r \cdot a \in I$, for all $r \in R$ and $a \in I$.

Example 7.1.6 In the ring $\mathbf{F}_2[x]/(x^3 - 1)$, the subset

$$I := \{0, 1 + x, x + x^2, 1 + x^2\}$$

is an ideal.

Example 7.1.7 (i) In the ring $\mathbf{Z}$ of integers, all the even integers form an ideal.

(ii) For a fixed positive integer $m$, all the integers divisible by $m$ form an
ideal of $\mathbf{Z}$.

(iii) In the polynomial ring $\mathbf{F}_q[x]$, for a given nonzero polynomial $f(x)$, all
the polynomials divisible by $f(x)$ form an ideal.

(iv) In the ring $\mathbf{F}_q[x]/(x^n - 1)$, for a divisor $g(x)$ of $x^n - 1$, all the polynomials
divisible by $g(x)$ form an ideal.
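
For a small ring such as $\mathbf{F}_2[x]/(x^3 - 1)$, conditions (i) and (ii) of Definition 7.1.5 can be verified exhaustively. The sketch below is illustrative only: it assumes $q$ is prime, stores an element $a_0 + a_1x + \cdots + a_{n-1}x^{n-1}$ as the tuple $(a_0, \ldots, a_{n-1})$, and uses hypothetical helper names `mul` and `is_ideal`.

```python
from itertools import product

def mul(a, b, q):
    """Product of a and b in F_q[x]/(x^n - 1); a, b are coefficient tuples."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            # exponents are added modulo n because x^n = 1 in the quotient ring
            c[(i + j) % n] = (c[(i + j) % n] + ai * bj) % q
    return tuple(c)

def is_ideal(subset, n, q):
    """Check conditions (i) and (ii) of Definition 7.1.5 by brute force."""
    s = set(subset)
    comb = lambda a, b, sign: tuple((x + sign * y) % q for x, y in zip(a, b))
    closed = all(comb(a, b, 1) in s and comb(a, b, -1) in s for a in s for b in s)
    absorbs = all(mul(r, a, q) in s
                  for r in product(range(q), repeat=n) for a in s)
    return bool(s) and closed and absorbs

# Example 7.1.6: I = {0, 1 + x, x + x^2, 1 + x^2} in F_2[x]/(x^3 - 1)
I = {(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 1)}
print(is_ideal(I, 3, 2))                       # True
print(is_ideal({(0, 0, 0), (1, 0, 0)}, 3, 2))  # False: x * 1 = x is missing
```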

Definition 7.1.8 An ideal $I$ of a ring $R$ is called a principal ideal if there exists
an element $g \in I$ such that $I = \langle g \rangle$, where

$$\langle g \rangle := \{gr : r \in R\}.$$

The element g is called a generator of I and I is said to be generated by g. 

A ring R is called a principal ideal ring if every ideal of R is principal. 

Note that generators of a principal ideal may not be unique. 

Example 7.1.9 In Example 7.1.6, the ideal $I$ is principal. In fact, $I =
\langle 1 + x \rangle$. Note that

$$0 \cdot (1 + x) = 0 = 1 + x^3 = (1 + x + x^2)(1 + x),$$
$$1 \cdot (1 + x) = 1 + x = (x + x^2)(1 + x),$$
$$x \cdot (1 + x) = x + x^2 = (1 + x^2)(1 + x),$$
$$x^2 \cdot (1 + x) = 1 + x^2 = (1 + x)(1 + x).$$

Theorem 7.1.10 The rings $\mathbf{Z}$, $\mathbf{F}_q[x]$ and $\mathbf{F}_q[x]/(x^n - 1)$ are all principal ideal
rings.

Proof. Let $I$ be an ideal of $\mathbf{Z}$. If $I = \{0\}$, then $I = \langle 0 \rangle$ is a principal ideal.
Assume that $I \ne \{0\}$ and let $m$ be the smallest positive integer in $I$. Let $a$ be
any element of $I$. By the division algorithm, we have

$$a = qm + r \tag{7.2}$$

for some integers $q$ and $0 \le r \le m - 1$. The equality (7.2) implies that $r$ is
also an element of $I$ since $r = a - qm$. This forces $r = 0$ by the choice of $m$.
Hence, $I = \langle m \rangle$. This shows that $\mathbf{Z}$ is a principal ideal ring.

Using the same arguments, we can easily show that $\mathbf{F}_q[x]$ is also a principal
ideal ring.

Essentially the same method can be employed for the case $\mathbf{F}_q[x]/(x^n - 1)$.
Since this case is crucial for this chapter, we repeat the arguments. The zero
ideal is obviously principal. We choose a nonzero polynomial $g(x)$ of a nonzero
ideal $J$ with the lowest degree. For any polynomial $f(x)$ of $J$, we have

$$f(x) = s(x)g(x) + r(x)$$

for some polynomials $s(x), r(x) \in \mathbf{F}_q[x]$ with $\deg(r(x)) < \deg(g(x))$. This
forces $r(x) = 0$, since $r(x) = f(x) - s(x)g(x) \in J$ and $g(x)$ has the lowest
degree among the nonzero polynomials of $J$. Hence, $J = \langle g(x) \rangle$, and the
desired result follows. □


7.2 Generator polynomials 

The reason for defining ideals in the preceding section is the following result 
connecting ideals and cyclic codes. 


Theorem 7.2.1 Let $\pi$ be the linear map defined in (7.1). Then a nonempty
subset $C$ of $\mathbf{F}_q^n$ is a cyclic code if and only if $\pi(C)$ is an ideal of $\mathbf{F}_q[x]/(x^n - 1)$.

Proof. Suppose that $\pi(C)$ is an ideal of $\mathbf{F}_q[x]/(x^n - 1)$. Then, for any $\alpha, \beta \in
\mathbf{F}_q \subset \mathbf{F}_q[x]/(x^n - 1)$ and $\mathbf{a}, \mathbf{b} \in C$, we have $\alpha\pi(\mathbf{a}), \beta\pi(\mathbf{b}) \in \pi(C)$ by Definition
7.1.5(ii). Thus by Definition 7.1.5(i), $\alpha\pi(\mathbf{a}) + \beta\pi(\mathbf{b})$ is an element of $\pi(C)$;
i.e., $\pi(\alpha\mathbf{a} + \beta\mathbf{b}) \in \pi(C)$, hence $\alpha\mathbf{a} + \beta\mathbf{b}$ is a codeword of $C$. This shows that
$C$ is a linear code.

Now let $\mathbf{c} = (c_0, c_1, \ldots, c_{n-1})$ be a codeword of $C$. The polynomial

$$\pi(\mathbf{c}) = c_0 + c_1 x + \cdots + c_{n-2}x^{n-2} + c_{n-1}x^{n-1}$$

is an element of $\pi(C)$. Since $\pi(C)$ is an ideal, the element

$$x\pi(\mathbf{c}) = c_0 x + c_1 x^2 + \cdots + c_{n-2}x^{n-1} + c_{n-1}x^n$$
$$= c_{n-1} + c_0 x + c_1 x^2 + \cdots + c_{n-2}x^{n-1} + c_{n-1}(x^n - 1)$$
$$= c_{n-1} + c_0 x + c_1 x^2 + \cdots + c_{n-2}x^{n-1}$$

(as $x^n - 1 = 0$ in $\mathbf{F}_q[x]/(x^n - 1)$) is in $\pi(C)$; i.e., $(c_{n-1}, c_0, c_1, \ldots, c_{n-2})$ is a
codeword of $C$. This means that $C$ is cyclic.

Conversely, suppose that $C$ is a cyclic code. Then it is clear that (i) of
Definition 7.1.5 is satisfied for $\pi(C)$. For any polynomial

$$f(x) = f_0 + f_1 x + \cdots + f_{n-2}x^{n-2} + f_{n-1}x^{n-1} = \pi(f_0, f_1, \ldots, f_{n-1})$$

of $\pi(C)$ with $(f_0, f_1, \ldots, f_{n-1}) \in C$, the polynomial

$$xf(x) = f_{n-1} + f_0 x + f_1 x^2 + \cdots + f_{n-2}x^{n-1}$$

is also an element of $\pi(C)$ since $C$ is cyclic. Thus, $x^2 f(x) = x(xf(x))$ is an
element of $\pi(C)$. By induction, we know that $x^i f(x)$ belongs to $\pi(C)$ for all
$i \ge 0$. Since $C$ is a linear code and $\pi$ is a linear transformation, $\pi(C)$ is a
linear space over $\mathbf{F}_q$. Hence, for any $g(x) = g_0 + g_1 x + \cdots + g_{n-1}x^{n-1} \in
\mathbf{F}_q[x]/(x^n - 1)$, the polynomial

$$g(x)f(x) = \sum_{i=0}^{n-1} g_i\,(x^i f(x))$$

is an element of $\pi(C)$. Therefore, $\pi(C)$ is an ideal of $\mathbf{F}_q[x]/(x^n - 1)$ since (ii)
of Definition 7.1.5 is also satisfied. □
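
The key step of the proof is that multiplication by $x$ in $\mathbf{F}_q[x]/(x^n - 1)$ acts on coefficient vectors as a cyclic shift. A small sketch of this identity follows, assuming coefficient tuples $(c_0, \ldots, c_{n-1})$ over the integers modulo a prime $q$; the names `times_x` and `shift` are illustrative.

```python
def times_x(c, q):
    """Multiply c(x) by x in F_q[x]/(x^n - 1); c is (c_0, ..., c_{n-1})."""
    n = len(c)
    out = [0] * n
    for i, ci in enumerate(c):
        out[(i + 1) % n] = ci % q   # x^i -> x^(i+1), and x^n wraps around to 1
    return tuple(out)

def shift(c):
    """Cyclic shift (c_0, ..., c_{n-1}) -> (c_{n-1}, c_0, ..., c_{n-2})."""
    return (c[-1],) + tuple(c[:-1])

c = (1, 2, 0, 1)                   # 1 + 2x + x^3 over F_3, n = 4
print(times_x(c, 3))               # (1, 1, 2, 0)
print(times_x(c, 3) == shift(c))   # True
```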

Example 7.2.2 (i) The code $C = \{(0, 0, 0), (1, 1, 1), (2, 2, 2)\}$ is a ternary
cyclic code. The corresponding ideal in $\mathbf{F}_3[x]/(x^3 - 1)$ is $\pi(C) = \{0, 1 +
x + x^2, 2 + 2x + 2x^2\}$.

(ii) The set $I = \{0, 1 + x^2, x + x^3, 1 + x + x^2 + x^3\}$ is an ideal in
$\mathbf{F}_2[x]/(x^4 - 1)$. The corresponding cyclic code is $\pi^{-1}(I)$ = {0000, 1010,
0101, 1111}.

(iii) The trivial cyclic codes $\{\mathbf{0}\}$ and $\mathbf{F}_q^n$ correspond to the trivial ideals $\{0\}$
and $\mathbf{F}_q[x]/(x^n - 1)$, respectively.

Theorem 7.2.3 Let $I$ be a nonzero ideal in $\mathbf{F}_q[x]/(x^n - 1)$ and let $g(x)$ be a
nonzero monic polynomial of the least degree in $I$. Then $g(x)$ is a generator of
$I$ and divides $x^n - 1$.

Proof. For the first part, we refer to the proof of Theorem 7.1.10.

Consider the division algorithm

$$x^n - 1 = s(x)g(x) + r(x)$$

with $\deg(r(x)) < \deg(g(x))$. Hence,

$$r(x) = (x^n - 1) - s(x)g(x)$$

is an element of $I$ (note that $x^n - 1$ is the zero element of $\mathbf{F}_q[x]/(x^n - 1)$).
This implies that $r(x) = 0$ since $g(x)$ has the lowest degree. Hence, $g(x)$
is a divisor of $x^n - 1$. □

Example 7.2.4 In Example 7.2.2(i), the polynomial $1 + x + x^2$ is of the least
degree. It divides $x^3 - 1$. In Example 7.2.2(ii), the polynomial $1 + x^2$ is of the
least degree. It divides $x^4 - 1$.

For the code $\mathbf{F}_q^n$, the polynomial 1 is of the least degree.

By Theorem 7.1.10, we know that every ideal in $\mathbf{F}_q[x]/(x^n - 1)$ is principal,
thus a cyclic code $C$ is determined by any of the generators of $\pi(C)$. Usually,
there is more than one generator for an ideal of $\mathbf{F}_q[x]/(x^n - 1)$. The following
result shows that the generator satisfying certain additional properties is unique.

Theorem 7.2.5 There is a unique monic polynomial of the least degree in every
nonzero ideal $I$ of $\mathbf{F}_q[x]/(x^n - 1)$. (By Theorem 7.2.3, it is a generator of $I$.)

Proof. Let $g_i(x)$, $i = 1, 2$, be two distinct monic generators of the least degree
of the ideal $I$. Then, a suitable scalar multiple of $g_1(x) - g_2(x)$ is a nonzero
monic polynomial of smaller degree in $I$. This is a contradiction. □

From the above result, the following definition makes sense.

Definition 7.2.6 The unique monic polynomial of the least degree of a nonzero
ideal $I$ of $\mathbf{F}_q[x]/(x^n - 1)$ is called the generator polynomial of $I$. For a
cyclic code $C$, the generator polynomial of $\pi(C)$ is also called the generator
polynomial of $C$.

Example 7.2.7 (i) The generator polynomial of the cyclic code {000, 110,
011, 101} is $1 + x$.

(ii) The generator polynomial of the simplex code in Example 7.1.3(iii) is
$1 + x^2 + x^3 + x^4$.

Theorem 7.2.8 Each monic divisor of $x^n - 1$ is the generator polynomial of
some cyclic code in $\mathbf{F}_q^n$.

Proof. Let $g(x)$ be a monic divisor of $x^n - 1$ and let $I$ be the ideal $\langle g(x) \rangle$
of $\mathbf{F}_q[x]/(x^n - 1)$ generated by $g(x)$. Let $C$ be the corresponding cyclic code.
Assume that $h(x)$ is the generator polynomial of $C$. Then there exists a
polynomial $b(x)$ such that

$$h(x) \equiv g(x)b(x) \pmod{x^n - 1}.$$

Thus, $g(x)$ is a divisor of $h(x)$. Hence, $g(x)$ is the same as $h(x)$ since $h(x)$ has
the least degree and is monic. □

From Theorems 7.2.5 and 7.2.8, we obtain the following result.

Corollary 7.2.9 There is a one-to-one correspondence between the cyclic
codes in $\mathbf{F}_q^n$ and the monic divisors of $x^n - 1 \in \mathbf{F}_q[x]$.

Remark 7.2.10 The polynomials 1 and $x^n - 1$ correspond to $\mathbf{F}_q^n$ and $\{\mathbf{0}\}$,
respectively.

Example 7.2.11 In order to find all binary cyclic codes of length 6, we factorize
the polynomial $x^6 - 1 \in \mathbf{F}_2[x]$:

$$x^6 - 1 = (1 + x)^2(1 + x + x^2)^2.$$

List all the monic divisors of $x^6 - 1$:

$$1, \quad 1 + x, \quad 1 + x + x^2,$$
$$(1 + x)^2, \quad (1 + x)(1 + x + x^2), \quad (1 + x)^2(1 + x + x^2),$$
$$(1 + x + x^2)^2, \quad (1 + x)(1 + x + x^2)^2, \quad 1 + x^6.$$

Thus, there are nine binary cyclic codes of length 6 altogether. Based on the
map $\pi$, we can easily write down all these cyclic codes. For instance, the cyclic
code corresponding to the polynomial $(1 + x + x^2)^2$ is

{000000, 101010, 010101, 111111}.
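
To see how a divisor of $x^n - 1$ yields its cyclic code, one can list all multiples of the divisor in $\mathbf{F}_q[x]/(x^n - 1)$. The sketch below does this by exhaustive search for the divisor $(1 + x + x^2)^2 = 1 + x^2 + x^4$ of $x^6 - 1$ over $\mathbf{F}_2$; the function name `cyclic_code` and the coefficient-list representation are assumptions made for illustration, and $q$ is assumed prime.

```python
from itertools import product

def cyclic_code(g, n, q):
    """All multiples of g(x) in F_q[x]/(x^n - 1), as length-n tuples."""
    code = set()
    for a in product(range(q), repeat=n):      # every a(x) in the quotient ring
        c = [0] * n
        for i, ai in enumerate(a):
            for j, gj in enumerate(g):
                c[(i + j) % n] = (c[(i + j) % n] + ai * gj) % q
        code.add(tuple(c))
    return code

g = [1, 0, 1, 0, 1]   # (1 + x + x^2)^2 = 1 + x^2 + x^4 over F_2
words = sorted("".join(map(str, c)) for c in cyclic_code(g, 6, 2))
print(words)   # ['000000', '010101', '101010', '111111']
```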

From the above example, we find that the number of cyclic codes of length $n$
can be determined if we know the factorization of $x^n - 1$. We have the following
result.

Theorem 7.2.12 Let $x^n - 1 \in \mathbf{F}_q[x]$ have the factorization

$$x^n - 1 = \prod_{i=1}^{r} p_i^{e_i}(x),$$

where $p_1(x), p_2(x), \ldots, p_r(x)$ are distinct monic irreducible polynomials and
$e_i \ge 1$ for all $i = 1, 2, \ldots, r$. Then there are $\prod_{i=1}^{r}(e_i + 1)$ cyclic codes of
length $n$ over $\mathbf{F}_q$.

The proof of Theorem 7.2.12 follows from Corollary 7.2.9 by counting the
number of monic divisors of $x^n - 1$.

Example 7.2.13 Using Theorem 3.4.11, we can factorize the polynomial
$x^n - 1$, and thus the number of cyclic codes of length $n$ can be determined
by Theorem 7.2.12.

Tables 7.1 and 7.2 show the factorization of $x^n - 1$ and the number of $q$-ary
cyclic codes of length $n$, for $1 \le n \le 10$ and $q = 2, 3$.
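
By Corollary 7.2.9, the number of binary cyclic codes of length $n$ equals the number of divisors of $x^n - 1$ in $\mathbf{F}_2[x]$ (every nonzero binary polynomial is monic). The following sketch counts these divisors by trial division rather than by factorization, which is a different route from Theorem 7.2.12 but gives the same counts as in Table 7.1. Binary polynomials are stored as Python integers whose bit $i$ is the coefficient of $x^i$; the function names are illustrative.

```python
def poly_mod2(a, b):
    """Remainder of a(x) divided by b(x) over F_2; polynomials are bitmasks."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)   # cancel the leading term of a(x)
    return a

def count_binary_cyclic_codes(n):
    """Number of divisors of x^n - 1 in F_2[x] (= number of cyclic codes)."""
    target = (1 << n) | 1                 # x^n + 1, which equals x^n - 1 over F_2
    return sum(poly_mod2(target, d) == 0 for d in range(1, 1 << (n + 1)))

print([count_binary_cyclic_codes(n) for n in range(1, 11)])
# [2, 3, 4, 5, 4, 9, 8, 9, 8, 9]
```

The search is exponential in $n$, so it is only meant for checking small cases.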

Since a cyclic code is totally determined by its generator polynomial, all the
parameters of the code are also determined by the generator polynomial. The
following result gives the dimension in terms of the generator polynomial.

Table 7.1. Binary cyclic codes of length up to 10.

 n   Factorization of $x^n - 1$                     No. of cyclic codes
 1   $1 + x$                                         2
 2   $(1 + x)^2$                                     3
 3   $(1 + x)(1 + x + x^2)$                          4
 4   $(1 + x)^4$                                     5
 5   $(1 + x)(1 + x + x^2 + x^3 + x^4)$              4
 6   $(1 + x)^2(1 + x + x^2)^2$                      9
 7   $(1 + x)(1 + x^2 + x^3)(1 + x + x^3)$           8
 8   $(1 + x)^8$                                     9
 9   $(1 + x)(1 + x + x^2)(1 + x^3 + x^6)$           8
10   $(1 + x)^2(1 + x + x^2 + x^3 + x^4)^2$          9


Theorem 7.2.14 Let $g(x)$ be the generator polynomial of an ideal of $\mathbf{F}_q[x]/
(x^n - 1)$. Then the corresponding cyclic code has dimension $k$ if the degree of
$g(x)$ is $n - k$.

Proof. For two polynomials $c_1(x) \ne c_2(x)$ with $\deg(c_i(x)) \le k - 1$ ($i = 1, 2$),
we have clearly that $g(x)c_1(x) \not\equiv g(x)c_2(x) \pmod{x^n - 1}$. Hence, the set

$$A := \{g(x)c(x) : c(x) \in \mathbf{F}_q[x]/(x^n - 1), \deg(c(x)) \le k - 1\}$$

has $q^k$ elements and is a subset of the ideal $\langle g(x) \rangle$. On the other hand, for
any codeword $g(x)a(x)$ with $a(x) \in \mathbf{F}_q[x]/(x^n - 1)$, we write

$$a(x)g(x) = u(x)(x^n - 1) + v(x) \tag{7.3}$$

with $\deg(v(x)) < n$. By (7.3), we have that $v(x) = a(x)g(x) - u(x)(x^n - 1)$.
Hence, $g(x)$ divides $v(x)$, as $g(x)$ divides $x^n - 1$. Write $v(x) = g(x)b(x)$ for some
polynomial $b(x)$. Then $\deg(b(x)) < k$, so $v(x)$ is in $A$. This shows that $A$ is the
same as $\langle g(x) \rangle$. Hence, the dimension of the code is $\log_q |A| = k$. □


Table 7.2. Ternary cyclic codes of length up to 10.

 n   Factorization of $x^n - 1$                                              No. of cyclic codes
 1   $2 + x$                                                                  2
 2   $(2 + x)(1 + x)$                                                         4
 3   $(2 + x)^3$                                                              4
 4   $(2 + x)(1 + x)(1 + x^2)$                                                8
 5   $(2 + x)(1 + x + x^2 + x^3 + x^4)$                                       4
 6   $(2 + x)^3(1 + x)^3$                                                     16
 7   $(2 + x)(1 + x + x^2 + x^3 + x^4 + x^5 + x^6)$                           4
 8   $(2 + x)(1 + x)(1 + x^2)(2 + x + x^2)(2 + 2x + x^2)$                     32
 9   $(2 + x)^9$                                                              10
10   $(2 + x)(1 + x)(1 + x + x^2 + x^3 + x^4)(1 + 2x + x^2 + 2x^3 + x^4)$     16

Example 7.2.15 (i) Based on the factorization $x^7 - 1 = (1 + x)(1 + x^2 +
x^3)(1 + x + x^3) \in \mathbf{F}_2[x]$, we know that there are only two binary [7, 3]-cyclic
codes:

$\langle (1 + x)(1 + x^2 + x^3) \rangle$ = {0000000, 1110100, 0111010, 0011101,
1001110, 0100111, 1010011, 1101001}

and

$\langle (1 + x)(1 + x + x^3) \rangle$ = {0000000, 1011100, 0101110, 0010111,
1001011, 1100101, 1110010, 0111001}.

(ii) Based on the factorization $x^7 - 1 = (2 + x)(1 + x + x^2 + x^3 + x^4 +
x^5 + x^6) \in \mathbf{F}_3[x]$, we do not have any ternary [7, 2]-cyclic codes.
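
The set $A$ from the proof of Theorem 7.2.14 gives a direct way to list a cyclic code from its generator polynomial: multiply $g(x)$ by every polynomial of degree at most $k - 1$. The sketch below applies this to the two generator polynomials of Example 7.2.15(i). It assumes $q$ is prime and uses coefficient lists with the lowest degree first; `polymul` and `code_from_generator` are illustrative names.

```python
from itertools import product

def polymul(a, b, q):
    """Ordinary product of two coefficient lists over F_q (low degree first)."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] = (c[i + j] + ai * bj) % q
    return c

def code_from_generator(g, n, q):
    """The set A = {g(x)c(x) : deg c(x) <= k - 1}, with k = n - deg g(x)."""
    k = n - (len(g) - 1)
    words = set()
    for c in product(range(q), repeat=k):
        w = polymul(g, list(c), q)          # degree < n, so no reduction is needed
        words.add(tuple(w + [0] * (n - len(w))))
    return words

g1 = polymul([1, 1], [1, 0, 1, 1], 2)   # (1 + x)(1 + x^2 + x^3)
g2 = polymul([1, 1], [1, 1, 0, 1], 2)   # (1 + x)(1 + x + x^3)
for g in (g1, g2):
    C = code_from_generator(g, 7, 2)
    print(g, len(C))   # eight codewords each, so both codes have dimension 3
```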


7.3 Generator and parity-check matrices 

In the previous section, we showed that a cyclic code is totally determined by its 
generator polynomial. Hence, such a code should also have generator matrices 
determined by this polynomial. Indeed, we have the following result. 


Theorem 7.3.1 Let $g(x) = g_0 + g_1 x + \cdots + g_{n-k}x^{n-k}$ be the generator
polynomial of a cyclic code $C$ in $\mathbf{F}_q^n$ with $\deg(g(x)) = n - k$. Then the matrix

$$G = \begin{pmatrix} g(x) \\ xg(x) \\ \vdots \\ x^{k-1}g(x) \end{pmatrix}
= \begin{pmatrix}
g_0 & g_1 & \cdots & g_{n-k} & 0 & 0 & \cdots & 0 \\
0 & g_0 & g_1 & \cdots & g_{n-k} & 0 & \cdots & 0 \\
\vdots & & \ddots & & & \ddots & & \vdots \\
0 & 0 & \cdots & 0 & g_0 & g_1 & \cdots & g_{n-k}
\end{pmatrix}$$

is a generator matrix of $C$ (note that we identify a vector with a polynomial).

Proof. It is sufficient to show that $g(x), xg(x), \ldots, x^{k-1}g(x)$ form a basis of
$C$. It is clear that they are linearly independent over $\mathbf{F}_q$. By Theorem 7.2.14,
we know that $\dim(C) = k$. The desired result follows. □


Example 7.3.2 Consider the binary [7, 4]-cyclic code with generator
polynomial $g(x) = 1 + x^2 + x^3$. Then this code has a generator matrix

$$G = \begin{pmatrix} g(x) \\ xg(x) \\ x^2g(x) \\ x^3g(x) \end{pmatrix}
= \begin{pmatrix}
1 & 0 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 1
\end{pmatrix}.$$

This generator matrix is not in standard form. If the fourth row is added to the
second row and the sum of the last two rows is added to the first row, we form
a generator matrix in standard form:

$$G' = \begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 1
\end{pmatrix}.$$

Thus, a parity-check matrix is easily obtained from $G'$ by Algorithm 4.3.
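
The construction of Theorem 7.3.1 and the row operations of Example 7.3.2 can be reproduced in a few lines. This is only a sketch for the binary case, with rows stored as Python lists; the helper names `generator_matrix` and `add` are illustrative.

```python
def generator_matrix(g, n):
    """Rows g(x), xg(x), ..., x^(k-1)g(x) as length-n coefficient lists."""
    k = n - (len(g) - 1)
    return [[0] * i + list(g) + [0] * (k - 1 - i) for i in range(k)]

def add(r, s):
    """Sum of two binary rows."""
    return [(a + b) % 2 for a, b in zip(r, s)]

G = generator_matrix([1, 0, 1, 1], 7)        # g(x) = 1 + x^2 + x^3

# Row operations from Example 7.3.2 (rows are 0-indexed here)
Gs = [row[:] for row in G]
Gs[1] = add(Gs[1], Gs[3])                    # second row += fourth row
Gs[0] = add(Gs[0], add(Gs[2], Gs[3]))        # first row += third row + fourth row
for row in Gs:
    print(row)
# [1, 0, 0, 0, 1, 0, 1]
# [0, 1, 0, 0, 1, 1, 1]
# [0, 0, 1, 0, 1, 1, 0]
# [0, 0, 0, 1, 0, 1, 1]
```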


From the above example, we know that parity-check matrices of a cyclic 
code can be obtained from its generator matrices by performing elementary 
row operations. However, since the dual code of a cyclic code C is also cyclic, 
we should be able to find a parity-check matrix from the generator polynomial 
of the dual code. The question then is to find the generator polynomial of the
dual code $C^{\perp}$.


Definition 7.3.3 Let $h(x) = \sum_{i=0}^{k} a_i x^i$ be a polynomial of degree $k$ ($a_k \ne 0$)
over $\mathbf{F}_q$. Define the reciprocal polynomial $h_R(x)$ of $h(x)$ by

$$h_R(x) := x^k h(1/x) = \sum_{i=0}^{k} a_{k-i}x^i.$$

Remark 7.3.4 If $h(x)$ is a divisor of $x^n - 1$, then so is $h_R(x)$.

Example 7.3.5 (i) For the polynomial $h(x) = 1 + 2x + 3x^5 + x^7 \in \mathbf{F}_5[x]$, the
reciprocal of $h(x)$ is

$$h_R(x) = x^7 h(1/x)$$
$$= x^7\left(1 + 2(1/x) + 3(1/x)^5 + (1/x)^7\right)$$
$$= 1 + 3x^2 + 2x^6 + x^7.$$

(ii) Consider the divisor $h(x) = 1 + x + x^3 \in \mathbf{F}_2[x]$ of $x^7 - 1$. Then
$h_R(x) = 1 + x^2 + x^3$ is also a divisor of $x^7 - 1$.
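
In terms of coefficient lists, forming the reciprocal polynomial of Definition 7.3.3 simply reverses the coefficients $a_0, \ldots, a_k$. A minimal sketch, assuming a nonzero polynomial stored with the lowest degree first; the function name `reciprocal` is illustrative:

```python
def reciprocal(h):
    """Coefficients of h_R(x) = x^k h(1/x), where k = deg h(x)."""
    h = list(h)
    while h and h[-1] == 0:   # drop trailing zeros so that len(h) - 1 = deg h
        h.pop()
    return h[::-1]

# Example 7.3.5(i): h(x) = 1 + 2x + 3x^5 + x^7 over F_5
print(reciprocal([1, 2, 0, 0, 0, 3, 0, 1]))  # [1, 0, 3, 0, 0, 0, 2, 1]
# Example 7.3.5(ii): h(x) = 1 + x + x^3 over F_2
print(reciprocal([1, 1, 0, 1]))              # [1, 0, 1, 1]
```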


Example 7.3.6 Let $g(x) = g_0 + g_1 x + g_2 x^2 + g_3 x^3$ be the generator polynomial
of a cyclic code $C$ over $\mathbf{F}_q$ of length 4 and let $h(x) = (x^4 - 1)/g(x)$. Put
$h(x) = h_0 + h_1 x + h_2 x^2 + h_3 x^3$. Then $h_R(x) = (h_3 + h_2 x + h_1 x^2 + h_0 x^3)/x^{3-k}$,
where $k = \deg(h(x))$.

Consider the product

$$0 \equiv g(x)h(x)$$
$$= (g_0 + g_1 x + g_2 x^2 + g_3 x^3)(h_0 + h_1 x + h_2 x^2 + h_3 x^3)$$
$$= g_0 h_0 + (g_0 h_1 + g_1 h_0)x + (g_0 h_2 + g_1 h_1 + g_2 h_0)x^2 + (g_0 h_3 + g_1 h_2 + g_2 h_1 + g_3 h_0)x^3$$
$$\quad + (g_1 h_3 + g_2 h_2 + g_3 h_1)x^4 + (g_2 h_3 + g_3 h_2)x^5 + g_3 h_3 x^6$$
$$\equiv (g_0 h_0 + g_1 h_3 + g_2 h_2 + g_3 h_1) + (g_0 h_1 + g_1 h_0 + g_2 h_3 + g_3 h_2)x$$
$$\quad + (g_0 h_2 + g_1 h_1 + g_2 h_0 + g_3 h_3)x^2 + (g_0 h_3 + g_1 h_2 + g_2 h_1 + g_3 h_0)x^3 \pmod{x^4 - 1}. \tag{7.4}$$

Thus, the coefficient of each power of $x$ at the last step of (7.4) must be zero.

Put $\mathbf{b} = (h_3, h_2, h_1, h_0) \in \mathbf{F}_q^4$ and $\mathbf{g} = (g_0, g_1, g_2, g_3) \in \mathbf{F}_q^4$. Let $\mathbf{g}_i$ be the
vector obtained from $\mathbf{g}$ by cyclically shifting $i$ positions. By looking at the
coefficient of $x^3$ in (7.4), we obtain

$$\mathbf{g}_0 \cdot \mathbf{b} = \mathbf{g} \cdot \mathbf{b} = g_0 h_3 + g_1 h_2 + g_2 h_1 + g_3 h_0 = 0.$$

By looking at the coefficients of the other powers of $x$ in (7.4), we obtain
$\mathbf{g}_i \cdot \mathbf{b} = 0$ for all $i = 0, 1, 2, 3$. Therefore, $\mathbf{b}$ is a codeword of $C^{\perp}$ since the set
$\{\mathbf{g}_0, \mathbf{g}_1, \mathbf{g}_2, \mathbf{g}_3\}$ generates $C$ by Theorem 7.3.1.

By cyclically shifting the vector $\mathbf{b} = (h_3, h_2, h_1, h_0)$ by $k + 1$ positions, we
obtain the vector corresponding to $h_R(x)$. This implies that $h_R(x)$ is a codeword
as $C^{\perp}$ is also a cyclic code.

Since $\deg(h_R(x)) = \deg(h(x)) = k$, the set $\{h_R(x), xh_R(x), \ldots, x^{n-k-1}h_R(x)\}$
is a basis of $C^{\perp}$. Hence, $C^{\perp}$ is generated by $h_R(x)$. Thus, the monic polynomial
$h_0^{-1}h_R(x)$ is the generator polynomial of $C^{\perp}$ (note that $h_0 = h(0) \ne 0$ since
$h_0 g_0 = -1$).


It is clear that the above example can be easily generalized to any length n. 


Theorem 7.3.7 Let $g(x)$ be the generator polynomial of a $q$-ary $[n, k]$-cyclic
code $C$. Put $h(x) = (x^n - 1)/g(x)$. Then $h_0^{-1}h_R(x)$ is the generator polynomial
of $C^{\perp}$, where $h_0$ is the constant term of $h(x)$.

Proof. Let $g(x) = \sum_{i=0}^{n-1} g_i x^i$ and let $h(x) = \sum_{i=0}^{n-1} h_i x^i$. Then

$$h_R(x) = (1/x^{n-k-1})\sum_{i=0}^{n-1} h_{n-i-1}x^i,$$

where $k = \deg(h(x))$.

Consider the product

$$0 \equiv g(x)h(x)$$
$$\equiv (g_0 h_0 + g_1 h_{n-1} + \cdots + g_{n-1}h_1) + (g_0 h_1 + g_1 h_0 + \cdots + g_{n-1}h_2)x$$
$$\quad + (g_0 h_2 + g_1 h_1 + \cdots + g_{n-1}h_3)x^2 + \cdots$$
$$\quad + (g_0 h_{n-1} + g_1 h_{n-2} + \cdots + g_{n-1}h_0)x^{n-1} \pmod{x^n - 1}.$$

Thus, the coefficient of each power of $x$ in the last line of the above display
must be zero. By looking at the coefficient of each power of $x$, we obtain $\mathbf{g}_i \cdot
(h_{n-1}, h_{n-2}, \ldots, h_1, h_0) = 0$, for all $i = 0, 1, \ldots, n - 1$, where $\mathbf{g}_i$ is the vector
obtained from $(g_0, g_1, \ldots, g_{n-1})$ by cyclically shifting $i$ positions. Therefore,
$(h_{n-1}, h_{n-2}, \ldots, h_1, h_0)$ is a codeword of $C^{\perp}$ since $\{\mathbf{g}_0, \mathbf{g}_1, \ldots, \mathbf{g}_{n-1}\}$ generates
$C$ by Theorem 7.3.1.

By cyclically shifting the vector $(h_{n-1}, h_{n-2}, \ldots, h_1, h_0)$ by $k + 1$ positions,
we obtain the vector corresponding to $h_R(x)$. This implies that $h_R(x)$ is a
codeword as $C^{\perp}$ is also a cyclic code.

Since $\deg(h_R(x)) = \deg(h(x)) = k$, the set $\{h_R(x), xh_R(x), \ldots, x^{n-k-1}h_R(x)\}$
is a basis of $C^{\perp}$. Hence, $C^{\perp}$ is generated by $h_R(x)$. Thus, the monic polynomial
$h_0^{-1}h_R(x)$ is the generator polynomial of $C^{\perp}$. □

Definition 7.3.8 Let $C$ be a $q$-ary cyclic code of length $n$ with generator
polynomial $g(x)$. Put $h(x) = (x^n - 1)/g(x)$. Then $h_0^{-1}h_R(x)$ is called the
parity-check polynomial of $C$, where $h_0$ is the constant term of $h(x)$.

Corollary 7.3.9 Let $C$ be a $q$-ary $[n, k]$-cyclic code with generator polynomial
$g(x)$. Put $h(x) = (x^n - 1)/g(x)$. Let $h(x) = h_0 + h_1 x + \cdots + h_k x^k$. Then the
matrix

$$H = \begin{pmatrix} h_R(x) \\ xh_R(x) \\ \vdots \\ x^{n-k-1}h_R(x) \end{pmatrix}
= \begin{pmatrix}
h_k & h_{k-1} & \cdots & h_0 & 0 & 0 & \cdots & 0 \\
0 & h_k & h_{k-1} & \cdots & h_0 & 0 & \cdots & 0 \\
\vdots & & \ddots & & & \ddots & & \vdots \\
0 & 0 & \cdots & 0 & h_k & h_{k-1} & \cdots & h_0
\end{pmatrix}$$

is a parity-check matrix of $C$.

Proof. The result immediately follows from Theorems 7.3.1 and 7.3.7. □

Example 7.3.10 Let $C$ be the binary [7, 4]-cyclic code generated by $g(x) =
1 + x^2 + x^3$ as in Example 7.3.2. Put $h(x) = (x^7 - 1)/g(x) = 1 + x^2 + x^3 + x^4$.
Then $h_R(x) = 1 + x + x^2 + x^4$ is the parity-check polynomial of $C$. Hence,

$$H = \begin{pmatrix}
1 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 1 & 1 & 0 & 1
\end{pmatrix}$$

is a parity-check matrix of $C$.
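
The computation in Example 7.3.10 can be checked mechanically: divide $x^7 - 1$ by $g(x)$, reverse the quotient, build $H$ as in Corollary 7.3.9, and confirm that every row of the generator matrix of Theorem 7.3.1 is orthogonal to every row of $H$. The sketch below assumes a prime field and Python 3.8 or later (for the modular inverse `pow(a, -1, q)`); `polydiv` is an illustrative helper, not a library function.

```python
def polydiv(a, b, q):
    """Quotient and remainder of a(x) / b(x) over F_q, q prime (low degree first)."""
    a = [x % q for x in a]
    inv = pow(b[-1], -1, q)                  # inverse of the leading coefficient
    quo = [0] * max(len(a) - len(b) + 1, 1)
    for i in range(len(a) - len(b), -1, -1):
        quo[i] = a[i + len(b) - 1] * inv % q
        for j, bj in enumerate(b):
            a[i + j] = (a[i + j] - quo[i] * bj) % q
    return quo, a[:len(b) - 1]

g, n = [1, 0, 1, 1], 7                       # g(x) = 1 + x^2 + x^3
h, r = polydiv([-1] + [0] * (n - 1) + [1], g, 2)   # (x^7 - 1) / g(x)
hR = h[::-1]   # h_0 = 1 here, so the reversed list is already the monic h_R(x)
k = len(h) - 1

H = [[0] * i + hR + [0] * (n - k - 1 - i) for i in range(n - k)]
G = [[0] * i + g + [0] * (k - 1 - i) for i in range(k)]
ok = all(sum(a * b for a, b in zip(u, v)) % 2 == 0 for u in G for v in H)
print(h, hR, ok)   # [1, 0, 1, 1, 1] [1, 1, 1, 0, 1] True
```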


7.4 Decoding of cyclic codes 

The decoding of cyclic codes consists of the same three steps as the decoding of 
linear codes: computing the syndrome; finding the syndrome corresponding to 
the error pattern; and correcting the errors. Because of the pleasing structure of 
cyclic codes, the three steps for cyclic codes are usually simpler. Cyclic codes 
have considerable algebraic and geometric properties. If these properties are 
properly used, simplicity in the decoding can be easily achieved. 

From Corollary 7.3.9, for a cyclic code, we can easily produce a parity-check 
matrix of the form 


$$H = (I_{n-k} \mid A) \tag{7.5}$$

by performing elementary row operations. Though parity-check matrices for 
a linear code are not unique, the parity-check matrix of the form in (7.5) is 
unique. All syndromes considered in this section are computed with respect to 
the parity-check matrix of the form in (7.5). 

Theorem 7.4.1 Let $H = (I_{n-k} \mid A)$ be a parity-check matrix of a $q$-ary cyclic
code $C$. Let $g(x)$ be the generator polynomial of $C$. Then the syndrome of
a vector $\mathbf{w} \in \mathbf{F}_q^n$ is equal to $(w(x) \pmod{g(x)})$; i.e., the principal remainder
of $w(x)$ divided by $g(x)$ (note that here we identify a vector of $\mathbf{F}_q^n$ with a
polynomial of $\mathbf{F}_q[x]/(x^n - 1)$, and thus $w(x)$ is the corresponding polynomial
of $\mathbf{w}$).

Proof. For each column vector of $A$, we associate a polynomial of degree at
most $n - k - 1$ and write $A$ as

$$A = (a_0(x), a_1(x), \ldots, a_{k-1}(x)).$$

By Algorithm 4.3, we know that $G = (-A^{\mathrm{T}} \mid I_k)$ is a generator matrix for $C$.
Therefore, $x^{n-k+i} - a_i(x)$ is a codeword of $C$. Put $x^{n-k+i} - a_i(x) = q_i(x)g(x)$
for some $q_i(x) \in \mathbf{F}_q[x]/(x^n - 1)$; i.e.,

$$a_i(x) = x^{n-k+i} - q_i(x)g(x). \tag{7.6}$$

Suppose $w(x) = w_0 + w_1 x + \cdots + w_{n-1}x^{n-1}$. For the syndrome $\mathbf{s} = \mathbf{w}H^{\mathrm{T}}$ of
$\mathbf{w}$, the corresponding polynomial $s(x)$ is

$$s(x) = w_0 + w_1 x + \cdots + w_{n-k-1}x^{n-k-1} + w_{n-k}a_0(x) + \cdots + w_{n-1}a_{k-1}(x)$$
$$= \sum_{i=0}^{n-k-1} w_i x^i + \sum_{j=0}^{k-1} w_{n-k+j}\left(x^{n-k+j} - q_j(x)g(x)\right) \quad \text{(by (7.6))}$$
$$= \sum_{i=0}^{n-1} w_i x^i - \left(\sum_{j=0}^{k-1} w_{n-k+j}q_j(x)\right)g(x)$$
$$\equiv w(x) \pmod{g(x)}.$$

As the polynomial $s(x)$ has degree at most $n - k - 1$, the desired result
follows. □


Example 7.4.2 Consider the binary [7, 4, 3]-Hamming code with the generator polynomial $g(x) = 1 + x^2 + x^3$. Then, by performing elementary row operations from the matrix in Example 7.3.10, we obtain a parity-check matrix $H = (I_3 \mid A)$, where $A$ is the matrix

\[ A = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 1 & 1 & 0 & 1 \end{pmatrix}. \]

For the word $w = 0110110$, the syndrome is $s = wH^T = 010$. On the other hand,

\[ w(x) = x + x^2 + x^4 + x^5 = x + x^2 g(x). \]

Thus, the remainder $(w(x) \pmod{g(x)})$ is $x$, which corresponds to the word 010.
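Theorem 7.4.1 reduces syndrome computation to a polynomial remainder. The following is a minimal Python sketch for the binary case only. It assumes that a polynomial over $\mathbf{F}_2$ is packed into an integer whose bit $i$ is the coefficient of $x^i$; the helper names `poly_from_word`, `word_from_poly` and `gf2_mod` are illustrative and do not come from the text.

```python
def poly_from_word(word):
    """Pack a binary word c_0 c_1 ... into an int; bit i is the coefficient of x^i."""
    return sum(1 << i for i, c in enumerate(word) if c == "1")


def word_from_poly(p, length):
    """Unpack an int back into a binary word of the given length."""
    return "".join("1" if (p >> i) & 1 else "0" for i in range(length))


def gf2_mod(a, g):
    """Principal remainder of a(x) divided by g(x) over F_2 (g must be nonzero)."""
    deg_g = g.bit_length() - 1
    while a.bit_length() - 1 >= deg_g:
        # Cancel the leading term of a(x) with a shifted copy of g(x).
        a ^= g << (a.bit_length() - 1 - deg_g)
    return a


g = poly_from_word("1011")       # g(x) = 1 + x^2 + x^3
w = poly_from_word("0110110")    # w(x) = x + x^2 + x^4 + x^5
syndrome = gf2_mod(w, g)
print(word_from_poly(syndrome, 3))   # the syndrome as a word of length n - k = 3
```

For the word of Example 7.4.2 this reproduces the syndrome 010 obtained from $wH^T$.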


Theorem 7.4.1 shows that the syndrome of a received word $w(x)$ can be determined by the remainder $s(x) = (w(x) \pmod{g(x)})$. Hence, $w(x) - s(x)$ is a codeword.


Corollary 7.4.3 Let $g(x)$ be the generator polynomial of a cyclic code $C$. For a received word $w(x)$, if the remainder $s(x)$ of $w(x)$ divided by $g(x)$ has weight less than or equal to $\lfloor (d(C) - 1)/2 \rfloor$, then $s(x)$ is the error pattern of $w(x)$; i.e., $w(x)$ is decoded to $w(x) - s(x)$ by MLD.

Proof. By Theorem 7.4.1, we know that $w(x)$ and $s(x)$ are in the same coset. Furthermore, $s(x)$ is a coset leader by Exercise 4.44 since $\mathrm{wt}(s(x)) \le \lfloor (d(C) - 1)/2 \rfloor$. The desired result follows. □

Example 7.4.4 As in Example 7.4.2, the remainder of $w(x) = x + x^2 + x^4 + x^5$ divided by $g(x) = 1 + x^2 + x^3$ is $x$. Therefore, $w(x)$ is decoded to $w(x) - x = x^2 + x^4 + x^5 = 0010110$. If the word $w_1(x) = 1 + x^2 + x^3 + x^4$ is received, then the remainder $(w_1(x) \pmod{g(x)})$ is $1 + x + x^2$. In this case, we can use syndrome decoding to obtain the codeword $w_1(x) - x^4 = 1 + x^2 + x^3 = 1011000$, as the word 0000100 is the coset leader for the coset in which $w_1(x)$ lies.

From the above example, we see that, for some received words we can 
directly decode by throwing away the remainder from the words. However, for 
other words we have to use syndrome decoding. Because of the algebraic and 
geometric properties of cyclic codes, we can simplify the syndrome decoding 
for some received words. In the rest of this section, we will describe the
so-called error trapping decoding.

Lemma 7.4.5 Let $C$ be a q-ary $[n, k]$-cyclic code with generator polynomial $g(x)$. Let $s(x) = \sum_{i=0}^{n-k-1} s_i x^i$ be the syndrome of $w(x)$. Then the syndrome of the cyclic shift $xw(x)$ is equal to $xs(x) - s_{n-k-1}g(x)$.

Proof. By Theorem 7.4.1, it is sufficient to show that $xs(x) - s_{n-k-1}g(x)$ is the remainder of $xw(x)$ divided by $g(x)$. Let $w(x) = q(x)g(x) + s(x)$. Then

\[ xw(x) = xq(x)g(x) + xs(x) = (xq(x) + s_{n-k-1})g(x) + (xs(x) - s_{n-k-1}g(x)). \]

The desired result follows as $\deg(xs(x) - s_{n-k-1}g(x)) < n - k = \deg(g(x))$. □

Remark 7.4.6 The syndrome of the cyclic shift $x^i w(x)$ of a word $w(x)$ can be computed through the syndrome of the cyclic shift $x^{i-1}w(x)$. Thus, the syndromes of $w(x), xw(x), x^2 w(x), \ldots$ can be computed inductively.

Example 7.4.7 As in Example 7.4.2, the syndrome of $w(x) = x + x^2 + x^4 + x^5$ is $x$; thus the syndromes of $xw(x)$ and $x^2 w(x)$ are $x \cdot x = x^2$ and $x \cdot x^2 - g(x) = 1 + x^2$, respectively.
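The inductive update of Lemma 7.4.5 can be sketched in a few lines of Python for binary codes. This is an illustration under stated assumptions rather than a reference implementation: polynomials over $\mathbf{F}_2$ are stored as integers (bit $i$ is the coefficient of $x^i$), and the function name `shift_syndrome` is hypothetical.

```python
def shift_syndrome(s, g):
    """Syndrome of x*w(x), given the syndrome s(x) of w(x) and the generator g(x).

    Implements x*s(x) - s_{n-k-1} * g(x) over F_2, where n - k = deg g.
    """
    deg_g = g.bit_length() - 1
    s <<= 1                      # multiply s(x) by x
    if (s >> deg_g) & 1:         # s_{n-k-1} was 1, so a term x^{n-k} appeared
        s ^= g                   # subtract g(x); over F_2 this is XOR
    return s


g = 0b1101                       # g(x) = 1 + x^2 + x^3
s0 = 0b010                       # syndrome x of w(x) in Example 7.4.2
s1 = shift_syndrome(s0, g)       # syndrome of x w(x)
s2 = shift_syndrome(s1, g)       # syndrome of x^2 w(x)
print(bin(s1), bin(s2))
```

Each update costs one shift and at most one XOR, which is why the syndromes of successive cyclic shifts are cheap to obtain.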

Definition 7.4.8 A cyclic run of 0 of length $l$ of an n-tuple is a succession of $l$ cyclically consecutive zero components.

Example 7.4.9 (i) $e = (1, 3, 0, 0, 0, 0, 0, 1, 0)$ has a cyclic run of 0 of length 5.

(ii) $e = (0, 0, 1, 2, 0, 0, 0, 1, 0, 0)$ has a cyclic run of 0 of length 4.
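As an illustration of Definition 7.4.8, the longest cyclic run of 0 of a tuple can be found by walking the tuple twice, so that runs wrapping around the end are counted. The function name `longest_cyclic_zero_run` is an assumption made for this sketch.

```python
def longest_cyclic_zero_run(e):
    """Length of the longest succession of cyclically consecutive zero components."""
    n = len(e)
    if all(c == 0 for c in e):
        return n
    best = run = 0
    # Walking the tuple twice captures runs that wrap past the last position.
    for c in list(e) + list(e):
        run = run + 1 if c == 0 else 0
        best = max(best, run)
    return best


print(longest_cyclic_zero_run((1, 3, 0, 0, 0, 0, 0, 1, 0)))
print(longest_cyclic_zero_run((0, 0, 1, 2, 0, 0, 0, 1, 0, 0)))
```

The second tuple shows why the cyclic view matters: its longest run consists of the last two and the first two components.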

Decoding algorithm for cyclic codes

Let $C$ be a q-ary $[n, k, d]$-cyclic code with generator polynomial $g(x)$. Let $w(x)$ be a received word with an error pattern $e(x)$, where $\mathrm{wt}(e(x)) \le \lfloor (d - 1)/2 \rfloor$ and $e(x)$ has a cyclic run of 0 of length at least $k$. The goal is to determine $e(x)$.

Step 1: Compute the syndromes of $x^i w(x)$, for $i = 0, 1, 2, \ldots$, and denote by $s_i(x)$ the syndrome $(x^i w(x) \pmod{g(x)})$.

Step 2: Find $m$ such that the weight of the syndrome $s_m(x)$ for $x^m w(x)$ is less than or equal to $\lfloor (d - 1)/2 \rfloor$.

Step 3: Compute the remainder $e(x)$ of $x^{n-m}s_m(x)$ divided by $x^n - 1$. Decode $w(x)$ to $w(x) - e(x)$.

Proof. First of all, we show the existence of such an $m$ in Step 2. By the assumption, there exists an error pattern $e(x)$ such that $e(x)$ has a cyclic run of 0 of length at least $k$. Thus, there exists an integer $m \ge 0$ such that the cyclic shift of the error pattern $e(x)$ through $m$ positions has all its nonzero components within the first $n - k$ positions. The cyclic shift of the error pattern $e(x)$ through $m$ positions is in fact the remainder of $(x^m w(x) \pmod{x^n - 1})$ divided by $g(x)$. Put

\[ r(x) := ((x^m w(x) \pmod{x^n - 1}) \pmod{g(x)}) = (x^m w(x) \pmod{g(x)}). \]

The weight of $r(x)$ is clearly the same as the weight of $e(x)$, which is at most $\lfloor (d - 1)/2 \rfloor$. This shows the existence of $m$.

The word $t(x) := (x^{n-m}s_m(x) \pmod{x^n - 1})$ is a cyclic shift of $(s_m, 0)$ through $n - m$ positions, where $s_m$ is the vector of $\mathbf{F}_q^{n-k}$ corresponding to the polynomial $s_m(x)$. It is clear that the weight of $t(x)$ is the same as the weight of $s_m(x)$. Hence, $\mathrm{wt}(t(x)) \le \lfloor (d - 1)/2 \rfloor$. As

\begin{align*}
x^m(w(x) - t(x)) &\equiv x^m(w(x) - x^{n-m}s_m(x)) \\
                 &\equiv x^m w(x) - x^n s_m(x) \\
                 &\equiv s_m(x) - x^n s_m(x) \\
                 &= (1 - x^n)s_m(x) \\
                 &\equiv 0 \pmod{g(x)}
\end{align*}

and $x^m$ is co-prime to $g(x)$ (see Remark 3.2.5(iii)), we claim that $w(x) - t(x)$ is divisible by $g(x)$; i.e., $w(x) - t(x)$ is a codeword. As $t(x)$ and the error pattern $e(x)$ are in the same coset, we have that $e(x) = t(x) = (x^{n-m}s_m(x) \pmod{x^n - 1})$ by Exercise 4.44. □

Table 7.3.

  i    s_i(x)
  0    $1 + x + x^2$
  1    $1 + x$
  2    $x + x^2$
  3    $1$

Table 7.4.

  i    s_i(x)
  0    $1 + x^2 + x^5 + x^7$
  1    $1 + x + x^3 + x^4 + x^7$
  2    $1 + x + x^2 + x^5 + x^6 + x^7$
  3    $1 + x + x^2 + x^3 + x^4$
  4    $x + x^2 + x^3 + x^4 + x^5$
  5    $x^2 + x^3 + x^4 + x^5 + x^6$
  6    $x^3 + x^4 + x^5 + x^6 + x^7$
  7    $1 + x^5$

Example 7.4.10 (i) As in Example 7.4.4, consider the received word

\[ w_1(x) = 1011100 = 1 + x^2 + x^3 + x^4. \]

Compute the syndromes $s_i(x)$ of $x^i w_1(x)$ until $\mathrm{wt}(s_i(x)) \le 1$ (Table 7.3). Decode $w_1(x) = 1011100$ to $w_1(x) - x^4 s_3(x) = w_1(x) - x^4 = 1 + x^2 + x^3 = 1011000$.

(ii) Consider the binary [15, 7]-cyclic code generated by $g(x) = 1 + x^4 + x^6 + x^7 + x^8$. We can check from the parity-check matrices that the minimum distance is 5. An error pattern with weight at most 2 must have a cyclic run of 0 of length at least 7. Thus, we can correct such an error pattern using the above algorithm. Consider the received word

\[ w(x) = 110011101100010 = 1 + x + x^4 + x^5 + x^6 + x^8 + x^9 + x^{13}. \]

Compute the syndromes $s_i(x)$ of $x^i w(x)$ until $\mathrm{wt}(s_i(x)) \le 2$ (Table 7.4). Decode $w(x) = 110011101100010$ to $w(x) - x^8 s_7(x) = w(x) - x^8 - x^{13} = 1 + x + x^4 + x^5 + x^6 + x^9 = 110011100100000$.
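The three steps of the error trapping algorithm can be sketched in Python for binary cyclic codes. The sketch assumes polynomials over $\mathbf{F}_2$ are stored as integers (bit $i$ is the coefficient of $x^i$) and that the caller supplies $t = \lfloor (d-1)/2 \rfloor$; all function names are illustrative and not part of the text.

```python
def poly_from_word(word):
    """Pack a binary word c_0 c_1 ... into an int; bit i is the coefficient of x^i."""
    return sum(1 << i for i, c in enumerate(word) if c == "1")


def word_from_poly(p, n):
    return "".join("1" if (p >> i) & 1 else "0" for i in range(n))


def gf2_mod(a, g):
    """Remainder of a(x) divided by g(x) over F_2."""
    deg_g = g.bit_length() - 1
    while a.bit_length() - 1 >= deg_g:
        a ^= g << (a.bit_length() - 1 - deg_g)
    return a


def error_trap_decode(w, g, n, t):
    """Return (codeword, m), or None if no shift x^m w(x) has a syndrome of weight <= t."""
    deg_g = g.bit_length() - 1
    s = gf2_mod(w, g)                        # Step 1: s_0(x)
    for m in range(n):
        if bin(s).count("1") <= t:           # Step 2: the error is trapped in s_m(x)
            shift = (n - m) % n
            # Step 3: e(x) = x^(n-m) s_m(x) mod (x^n - 1), i.e. a cyclic rotation.
            e = ((s << shift) | (s >> (n - shift))) & ((1 << n) - 1)
            return w ^ e, m
        s <<= 1                              # Lemma 7.4.5: next syndrome from the last
        if (s >> deg_g) & 1:
            s ^= g
    return None


c1, m1 = error_trap_decode(poly_from_word("1011100"), poly_from_word("1011"), 7, 1)
print(m1, word_from_poly(c1, 7))
g2 = poly_from_word("100010111")             # 1 + x^4 + x^6 + x^7 + x^8
c2, m2 = error_trap_decode(poly_from_word("110011101100010"), g2, 15, 2)
print(m2, word_from_poly(c2, 15))
```

Run on the two received words of Example 7.4.10, the sketch stops at the same values of $m$ as Tables 7.3 and 7.4 and returns the same codewords.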


7.5 Burst-error-correcting codes 

So far, we have been concerned primarily with codes that correct random errors. 
However, there are certain communication channels, such as telephone lines 
and magnetic storage systems, which are affected by errors localized in short 
intervals rather than at random. Such an error is called a burst error. In general, 
codes for correcting random errors are not efficient for correcting burst errors. 
Therefore, it is desirable to construct codes specifically for correcting burst 
errors. Codes of this kind are called burst-error-correcting codes. 

Cyclic codes are very efficient for correcting burst errors. Many effective 
cyclic burst-error-correcting codes have been found since the late 1970s. In this 
section, we will discuss some properties of burst-error-correcting codes and a 
decoding algorithm. The codes in this section are all binary codes. 

Definition 7.5.1 A burst of length $l \ge 1$ is a binary vector whose nonzero components are confined to $l$ cyclically consecutive positions, with the first and last positions being nonzero.

A code is called an $l$-burst-error-correcting code if it can correct all burst errors of length $l$ or less; i.e., error patterns that are bursts of length $l$ or less.

Example 7.5.2 0011010000 is a burst of length 4, while 01000000000000100 is a burst of length 5.
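Definition 7.5.1 can be checked mechanically: the burst length of a vector is its length minus the largest cyclic gap of zeros between consecutive nonzero positions. The sketch below assumes this reformulation; the name `burst_length` is hypothetical.

```python
def burst_length(v):
    """Length of the shortest cyclic window holding all nonzero entries (0 for the zero vector)."""
    n = len(v)
    support = [i for i, c in enumerate(v) if c != "0" and c != 0]
    if not support:
        return 0
    # Number of zeros strictly between each nonzero position and the next one, cyclically.
    gaps = [
        (support[(j + 1) % len(support)] - support[j] - 1) % n
        for j in range(len(support))
    ]
    return n - max(gaps)


print(burst_length("0011010000"))
print(burst_length("01000000000000100"))
```

The second word is the instructive case: its two nonzero components are far apart when read left to right, but close when the positions are read cyclically.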

Theorem 7.5.3 A linear code $C$ is an $l$-burst-error-correcting code if and only if all the burst errors of length $l$ or less lie in distinct cosets of $C$.

Proof. If all the burst errors of length $l$ or less lie in distinct cosets, then each burst error is determined by its syndrome. The error can then be corrected through its syndrome.

On the other hand, suppose that two distinct burst errors $b_1$ and $b_2$ of length $l$ or less lie in the same coset of $C$. The difference $c = b_1 - b_2$ is a codeword. Thus, if $b_1$ is received, then $b_1$ could be decoded to both 0 and $c$. □

Corollary 7.5.4 Let $C$ be an $[n, k]$-linear $l$-burst-error-correcting code. Then

(i) no nonzero burst of length $2l$ or less can be a codeword;

(ii) (Reiger bound.) $n - k \ge 2l$.

Proof. (i) Suppose that there exists a codeword $c$ which is a burst of length $\le 2l$. Then, $c$ is of the form $(0, 1, u, v, 1, 0)$, where $u$ and $v$ are two words of length $\le l - 1$. Hence, the words $w = (0, 1, u, 0, 0, 0)$ and $c - w = (0, 0, 0, v, 1, 0)$ are two bursts of length $\le l$. They are in the same coset. This is a contradiction to Theorem 7.5.3.

(ii) Let $u_1, u_2, \ldots, u_{n-k+1}$ be the first $n - k + 1$ column vectors of a parity-check matrix of $C$. Then, they lie in $\mathbf{F}_2^{n-k}$ and are hence linearly dependent. Thus, there exist $c_1, \ldots, c_{n-k+1} \in \mathbf{F}_2$, not all zero, such that $\sum_{i=1}^{n-k+1} c_i u_i = 0$. This implies that $(c_1, c_2, \ldots, c_{n-k+1}, 0)$ is a codeword, and it is clear that this codeword is a burst of length $\le n - k + 1$. By part (i), we have $n - k + 1 > 2l$; i.e., $n - k \ge 2l$. □

An $[n, k]$-linear $l$-burst-error-correcting code satisfies $n - k \ge 2l$; i.e.,

\[ l \le \left\lfloor \frac{n - k}{2} \right\rfloor. \qquad (7.7) \]

A linear burst-error-correcting code achieving the above Reiger bound is called an optimal burst-error-correcting code.


Example 7.5.5 Let $C$ be the binary cyclic code of length 15 generated by $1 + x + x^2 + x^3 + x^6$. It is a [15, 9]-linear code. The reader may check that all the bursts of length 3 or less lie in distinct cosets of $C$. By Theorem 7.5.3, $C$ is a 3-burst-error-correcting code. The reader may also want to confirm Corollary 7.5.4(i) by checking that no burst errors of length 6 or less are codewords. This code is optimal as the Reiger bound (7.7) is achieved.
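The check suggested to the reader in Example 7.5.5 is small enough to run by brute force. The following Python sketch enumerates every binary burst of length 3 or less in length 15 and compares syndromes, using the fact from Theorem 7.4.1 that two words lie in the same coset exactly when they have the same remainder modulo $g(x)$. The integer encoding of polynomials and the names `rotate` and `cyclic_bursts` are assumptions of the sketch.

```python
def gf2_mod(a, g):
    """Remainder of a(x) divided by g(x) over F_2; bit i is the coefficient of x^i."""
    deg_g = g.bit_length() - 1
    while a.bit_length() - 1 >= deg_g:
        a ^= g << (a.bit_length() - 1 - deg_g)
    return a


def rotate(v, shift, n):
    """Cyclic shift of an n-bit vector through `shift` positions."""
    shift %= n
    return ((v << shift) | (v >> (n - shift))) & ((1 << n) - 1)


def cyclic_bursts(n, max_len):
    """All binary bursts of length 1..max_len in length n (Definition 7.5.1)."""
    bursts = set()
    for length in range(1, max_len + 1):
        if length == 1:
            shapes = [1]
        else:
            # First and last positions are 1; the interior is arbitrary.
            shapes = [1 | (inner << 1) | (1 << (length - 1))
                      for inner in range(1 << (length - 2))]
        for shape in shapes:
            for shift in range(n):
                bursts.add(rotate(shape, shift, n))
    return bursts


n = 15
g = 0b1001111                     # g(x) = 1 + x + x^2 + x^3 + x^6
bursts = cyclic_bursts(n, 3)
syndromes = {gf2_mod(b, g) for b in bursts}
print(len(bursts), len(syndromes), 0 in syndromes)
```

If the two printed counts agree and 0 is not among the syndromes, the bursts of length 3 or less (together with the zero word) occupy distinct cosets, which is the hypothesis of Theorem 7.5.3.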

Note that a burst of length $l$ has a run of 0 of length $n - l$. By Corollary 7.5.4, we have $k \le n - 2l \le n - l$ for an $[n, k]$-linear $l$-burst-error-correcting code. This satisfies the requirement for the decoding algorithm in Section 7.4 to correct an error containing a cyclic run of at least $k$ zeros. Hence, the algorithm can be directly employed to correct burst errors. The main difference is that, in the case of burst-error-correction, we do not require the weight of an error pattern to be less than or equal to $\lfloor (d(C) - 1)/2 \rfloor$. The modified decoding algorithm for burst-error-correction is as follows.

Decoding algorithm for cyclic burst-error-correcting codes

Let $C$ be a q-ary $[n, k]$-cyclic code with generator polynomial $g(x)$. Let $w(x)$ be a received word with an error pattern $e(x)$ that is a burst error of length $l$ or less.

Step 1: Compute the syndromes of $x^i w(x)$ for $i = 0, 1, 2, \ldots$, and denote by $s_i(x)$ the syndrome of $x^i w(x)$.

Step 2: Find $m$ such that the syndrome for $x^m w(x)$ is a burst of length $l$ or less.

Step 3: Compute the remainder $e(x)$ of $x^{n-m}s_m(x)$ divided by $x^n - 1$. Decode $w(x)$ to $w(x) - e(x)$.

Table 7.5.

  i    s_i(x)
  0    $1 + x + x^4 + x^5$
  1    $1 + x^3 + x^5$
  2    $1 + x^2 + x^3 + x^4$
  3    $x + x^3 + x^4 + x^5$
  4    $1 + x + x^3 + x^4 + x^5$
  5    $1 + x^3 + x^4 + x^5$
  6    $1 + x^2 + x^3 + x^4 + x^5$
  7    $1 + x^2 + x^4 + x^5$
  8    $1 + x^2 + x^5$
  9    $1 + x^2$

Table 7.6.

  Code parameters    Generator polynomials
  [7, 3]             $1 + x + x^2 + x^4$
  [15, 9]            $1 + x + x^2 + x^3 + x^6$
  [15, 7]            $1 + x^4 + x^6 + x^7 + x^8$
  [15, 5]            $1 + x + x^2 + x^4 + x^5 + x^8 + x^{10}$

The proof of the algorithm is similar to the one in the previous section. Now 
we use the code in Example 7.5.5 to illustrate the above algorithm. 

Example 7.5.6 Consider the binary [15, 9]-cyclic code generated by $g(x) = 1 + x + x^2 + x^3 + x^6$. We can correct all burst errors of length 3 or less. Suppose

\[ w(x) = 111011101100000 = 1 + x + x^2 + x^4 + x^5 + x^6 + x^8 + x^9. \]

Compute the syndromes $s_i(x)$ of $x^i w(x)$ until $s_m(x)$ is a burst of length 3 or less (Table 7.5). Decode $w(x) = 111011101100000$ to $w(x) - x^6 s_9(x) = w(x) - x^6 - x^8 = 1 + x + x^2 + x^4 + x^5 + x^9 = 111011000100000$.

We end our discussion with a list of a few optimal burst-error-correcting cyclic codes (see Table 7.6).
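The burst-trapping variant differs from error trapping only in the stopping test. In the Python sketch below (binary codes, polynomials stored as integers with bit $i$ the coefficient of $x^i$), the test is that the nonzero coefficients of $s_m(x)$ span at most $l$ consecutive positions; this is one concrete way of reading "the syndrome is a burst of length $l$ or less", chosen here as an assumption. The function names are illustrative.

```python
def poly_from_word(word):
    return sum(1 << i for i, c in enumerate(word) if c == "1")


def word_from_poly(p, n):
    return "".join("1" if (p >> i) & 1 else "0" for i in range(n))


def gf2_mod(a, g):
    deg_g = g.bit_length() - 1
    while a.bit_length() - 1 >= deg_g:
        a ^= g << (a.bit_length() - 1 - deg_g)
    return a


def span(s):
    """Number of positions from the lowest to the highest nonzero coefficient of s(x)."""
    if s == 0:
        return 0
    return s.bit_length() - (s & -s).bit_length() + 1


def burst_trap_decode(w, g, n, l):
    """Return (codeword, m), or None if no syndrome s_m(x) is a burst of length <= l."""
    deg_g = g.bit_length() - 1
    s = gf2_mod(w, g)
    for m in range(n):
        if span(s) <= l:
            shift = (n - m) % n
            # e(x) = x^(n-m) s_m(x) mod (x^n - 1): rotate the trapped burst back.
            e = ((s << shift) | (s >> (n - shift))) & ((1 << n) - 1)
            return w ^ e, m
        s <<= 1
        if (s >> deg_g) & 1:
            s ^= g
    return None


g = poly_from_word("1111001")                 # 1 + x + x^2 + x^3 + x^6
c, m = burst_trap_decode(poly_from_word("111011101100000"), g, 15, 3)
print(m, word_from_poly(c, 15))
```

For the received word of Example 7.5.6 the loop stops at the same index as Table 7.5 and removes the burst $x^6 + x^8$.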


Exercises 

7.1 Which of the following codes are cyclic?

(a) {(0, 0, 0), (1, 1,1), (2,2,2)} cF]; 

(b) {(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)} c F 3 ; 

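(c) $\{(x_0, x_1, \ldots, x_{n-1}) \in \mathbf{F}_q^n : \sum_{i=0}^{n-1} x_i = 0\}$;

(d) $\{(x_0, x_1, \ldots, x_{n-1}) \in \mathbf{F}_8^n : \sum_{i=0}^{n-1} x_i^2 = 0\}$;

(e) $\{(x_0, x_1, \ldots, x_{n-1}) \in \mathbf{F}_2^n : \sum_{i=0}^{n-1} (x_i^2 + x_i) = 0\}$.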

7.2 Show that the dual code of a cyclic code is cyclic. 

7.3 Show that the set $I = \{f(x) \in \mathbf{F}_q[x] : f(0) = f(1) = 0\}$ is an ideal of $\mathbf{F}_q[x]$ and find a generator.

7.4 Suppose that $x$, $y$ are two independent variables. Show that the polynomial ring $\mathbf{F}_q[x, y]$ is not a principal ideal ring.

7.5 Find all the possible monic generators for each of the following ideals:

(a) $I = \langle 1 + x + x^3 \rangle \subset \mathbf{F}_2[x]/(x^7 - 1)$;

(b) $I = \langle 1 + x^2 \rangle \subset \mathbf{F}_3[x]/(x^4 - 1)$.

7.6 Determine whether the following polynomials are generator polynomials 
of cyclic codes of given lengths: 

(a) $g(x) = 1 + x + x^2 + x^3 + x^4$ for a binary cyclic code of length 7;

(b) $g(x) = 2 + 2x^2 + x^3$ for a ternary cyclic code of length 8;

(c) $g(x) = 2 + 2x + x^3$ for a ternary cyclic code of length 13.

7.7 For each of the following cyclic codes, find the corresponding generator 
polynomial: 

(a) (Ml. 1 1) : ^f ? )cf; ; 

(b) $\{0000, 1010, 0101, 1111\} \subset \mathbf{F}_2^4$;

(c) $\{(x_0, x_1, \ldots, x_{n-1}) \in \mathbf{F}_q^n : \sum_{i=0}^{n-1} x_i = 0\}$;

(d) {(x 0 , Xi, . . . , x„_0 G F” : Ed x f = °}- 

7.8 Determine the smallest length for a binary cyclic code for which each of the following polynomials is the generator polynomial:

(a) $g(x) = 1 + x^4 + x^5$;

(b) $g(x) = 1 + x + x^2 + x^4 + x^6$.

7.9 Based on Example 3.4.13(ii), determine the following:

(a) the number of binary cyclic codes of length 21;

(b) all values $k$ for which there exists a binary $[21, k]$-cyclic code;

(c) the number of binary [21, 12]-cyclic codes;

(d) the generator polynomial for each of the binary [21, 12]-cyclic codes.

7.10 Based on Example 3.4.13(i), determine the following:

(a) the number of ternary cyclic codes of length 13;

(b) all values $k$ for which there exists a ternary $[13, k]$-cyclic code;

(c) the number of ternary [13, 7]-cyclic codes;

(d) the generator polynomial for each of the ternary [13, 7]-cyclic codes.

7.11 Construct the generator polynomials of all binary cyclic codes of 
length 15. 

7.12 Let $g(x) = (1 + x)(1 + x + x^3) \in \mathbf{F}_2[x]$ be the generator polynomial of a binary [7, 3]-cyclic code $C$. Write down a generator matrix and a parity-check matrix for $C$. Construct a generator matrix of the form $(I_3 \mid A)$.

7.13 Let $g(x) = 1 + x^4 + x^6 + x^7 + x^8 \in \mathbf{F}_2[x]$ be the generator polynomial of a binary [15, 7]-cyclic code $C$. Write down a generator matrix and a parity-check matrix for $C$. Construct a generator matrix of the form $(I_7 \mid A)$.

7.14 Suppose a generator (or parity-check, respectively) matrix of a linear code $C$ has the property that the cyclic shift of every row is still a codeword (or a codeword in the dual code, respectively). Show that $C$ is a cyclic code.

7.15 Let $g_1(x)$, $g_2(x)$ be the generator polynomials of the q-ary cyclic codes $C_1$, $C_2$ of the same length, respectively. Show that $C_1 \subseteq C_2$ if and only if $g_1(x)$ is divisible by $g_2(x)$.

7.16 Let $v \in \mathbf{F}_q^n$. Show that the generator polynomial of the smallest cyclic code containing $v$ is equal to $\gcd(v(x), x^n - 1)$, where $v(x)$ is the polynomial corresponding to $v$.

7.17 Determine the generator polynomial and the dimension of the smallest cyclic code containing each of the following words, respectively:

(a) $1000111 \in \mathbf{F}_2^7$;

(b) $(1, 0, 2, 0, 2, 0, 1, 1) \in \mathbf{F}_3^8$;

(c) $101010111110010 \in \mathbf{F}_2^{15}$.

7.18 Let $g(x)$ be the generator polynomial of a q-ary cyclic code $C$ of length $n$. Put $h(x) = (x^n - 1)/g(x)$. Show that, if $a(x)$ is a polynomial satisfying $\gcd(a(x), h(x)) = 1$, then $a(x)g(x)$ is a generator of $C$. Conversely, if $g_1(x)$ is a generator of $C$, then there exists a polynomial $a(x)$ satisfying $\gcd(a(x), h(x)) = 1$ such that $g_1(x) \equiv a(x)g(x) \pmod{x^n - 1}$.

7.19 (a) Show that, for any $1 \le k \le 26$, there exists a ternary cyclic code of length 27 and dimension $k$.

(b) Based on the factorization of $x^{15} - 1 \in \mathbf{F}_2[x]$, show that, for any $1 \le k \le 15$, there exists a binary cyclic code of length 15 and dimension $k$.

7.20 Let $\alpha$ be a primitive element of $\mathbf{F}_{2^m}$ and let $g(x) \in \mathbf{F}_2[x]$ be the minimal polynomial of $\alpha$ with respect to $\mathbf{F}_2$. Show that the cyclic code of length $2^m - 1$ with $g(x)$ as the generator polynomial is in fact a binary $[2^m - 1, 2^m - 1 - m, 3]$-Hamming code.

7.21 Let $C$ be a binary cyclic code of length $n \ge 3$ with generator polynomial $g(x) \ne 1$, where $n$ is the smallest positive integer for which $x^n - 1$ is divisible by $g(x)$. Show that $C$ has minimum distance at least 3. Is the result true for nonbinary cyclic codes?

7.22 Let $g(x)$ be the generator polynomial of a binary cyclic code $C$. Show that the subset $C_E$ of even-weight vectors in $C$ is also a cyclic code. Determine the generator polynomial of $C_E$ in terms of $g(x)$.

7.23 Let $C_i$ be a q-ary cyclic code of length $n$ with generator polynomial $g_i(x)$, for $i = 1, 2$.

(i) Show that $C_1 \cap C_2$ and $C_1 + C_2$ are both cyclic codes.

(ii) Determine the generator polynomials of $C_1 \cap C_2$ and $C_1 + C_2$ in terms of $g_1(x)$, $g_2(x)$.

7.24 A codeword $e(x)$ of a q-ary cyclic code $C$ of length $n$ is called an idempotent if $e^2(x) \equiv e(x) \pmod{x^n - 1}$. If an idempotent $e(x)$ is also a generator of $C$, it is called a generating idempotent. Let $g(x)$ be the generator polynomial of a q-ary cyclic code $C$ and put $h(x) = (x^n - 1)/g(x)$. Show that, if $\gcd(g(x), h(x)) = 1$, then $C$ has a unique generating idempotent. In particular, show that, if $\gcd(n, q) = 1$, then there always exists a unique generating idempotent for a q-ary cyclic code of length $n$.

7.25 Find the generating idempotent for each of the following cyclic codes: 

(a) the binary [7, 4]-Hamming code Ham(3, 2); 

(b) the binary [15, 11, 3]-Hamming code Ham(4, 2); 

(c) the ternary [13, 10, 3]-Hamming code Ham(3, 3). 

7.26 Let $C_i$ be a q-ary cyclic code of length $n$ with generating idempotent $e_i(x)$ ($i = 1, 2$). Show that $C_1 \cap C_2$ and $C_1 + C_2$ have generating idempotents $e_1(x)e_2(x)$ and $e_1(x) + e_2(x) - e_1(x)e_2(x)$, respectively.

7.27 An error pattern $e$ of a code $C$ is said to be detectable if $e + c \notin C$ for all $c \in C$. Show that, for a cyclic code, if an error pattern $e(x)$ is detectable, then its $i$th cyclic shift is also detectable, for any $i$.

7.28 Let $C$ be a binary [7, 4]-Hamming code with generator polynomial $g(x) = 1 + x + x^3$. Suppose each of the following received words has at most one error. Decode these words using error trapping:

(a) 1101011;

(b) 0101111;

(c) 0100011.

7.29 A binary [15, 7]-cyclic code is generated by $g(x) = 1 + x^4 + x^6 + x^7 + x^8$. Decode the following received words using error trapping:

(a) 110111101110110;

(b) 111110100001000.

7.30 A binary [15, 5]-cyclic code is generated by $g(x) = 1 + x + x^2 + x^4 + x^5 + x^8 + x^{10}$. Construct a parity-check matrix of the form $(I_{10} \mid A)$. Decode the following words using error trapping:

(a) 011111110101000;

(b) 100101111011100.

7.31 Let $C$ be the binary cyclic code of length 15 generated by $g(x) = 1 + x^2 + x^4 + x^5$.

(i) Find the minimum distance of $C$.

(ii) Show that $C$ can correct all bursts of length 2 or less.

(iii) Decode the following received words using burst-error-correction:
(a) 010110000000010; (b) 110000111010011.

7.32 Let $C$ be the binary [15, 9]-cyclic code generated by $g(x) = 1 + x^3 + x^4 + x^5 + x^6$. Decode the following received words using burst-error-correction:

(a) 101011101011100;

(b) 010000001011111.

7.33 Let $\alpha$ be a primitive element of $\mathbf{F}_{2^m}$ ($m > 2$) and let $g(x) \in \mathbf{F}_2[x]$ be its minimal polynomial with respect to $\mathbf{F}_2$. Let $C$ be the binary cyclic code of length $2^m - 1$ generated by $(x + 1)g(x)$. An error pattern of the form

\[ e(x) = x^i + x^{i+1} \]

is called a double-adjacent-error pattern. Show that no double-adjacent-error patterns can be in the same coset of $C$. Thus, $C$ can correct all the single-error patterns and double-adjacent-error patterns.

7.34 Let $g_1(x)$, $g_2(x)$ be two polynomials over $\mathbf{F}_q$. Let $n_i$ be the length of the shortest cyclic code that $g_i(x)$ generates, for $i = 1, 2$. Determine the length of the shortest cyclic code that $g_1(x)g_2(x)$ generates.

7.35 Let $C$ be a binary cyclic code with generator polynomial $g(x)$.

(i) Prove that, if $g(x)$ is divisible by $x - 1$, then all the codewords have even weight.

(ii) Suppose the length of $C$ is odd. Show that the all-one vector is a codeword if and only if $g(x)$ is not divisible by $x - 1$.

(iii) Suppose the length of $C$ is odd. Show that $C$ contains a codeword of odd weight if and only if the all-one vector is a codeword.





7.36 Let $g(x)$ be the generator polynomial of a q-ary $[n, k]$-cyclic code with $\gcd(n, q) = 1$. Show that the all-one vector is a codeword if and only if $g(x)$ is not divisible by $x - 1$.

7.37 Let $C$ be a q-ary $[q + 1, 2]$-linear code with minimum distance $q$. Show that, if $q$ is odd, then $C$ is not a cyclic code.

7.38 Let $n$ be a positive integer and $\gcd(n, q) = 1$. Assume that there are exactly $t$ elements in a complete set of representatives of cyclotomic cosets of $q$ modulo $n$. Show that there is a total of $2^t$ cyclic codes of length $n$ over $\mathbf{F}_q$.

7.39 Let $a \in \mathbf{F}_q^*$. A q-ary linear code $C$ is called constacyclic with respect to $a$ if $(ac_{n-1}, c_0, \ldots, c_{n-2})$ belongs to $C$ whenever $(c_0, c_1, \ldots, c_{n-1})$ belongs to $C$. In particular, $C$ is called negacyclic if $a = -1$.

(a) Show that a q-ary linear code $C$ of length $n$ is constacyclic with respect to $a$ if and only if $\pi_a(C)$ is an ideal of $\mathbf{F}_q[x]/(x^n - a)$, where $\pi_a$ is the map defined by

\[ \pi_a : \mathbf{F}_q^n \to \mathbf{F}_q[x]/(x^n - a), \qquad (c_0, c_1, \ldots, c_{n-1}) \mapsto \sum_{i=0}^{n-1} c_i x^i. \]

(b) Determine all the ternary negacyclic codes of length 8.

7.40 Suppose $x^n + 1$ has the factorization over $\mathbf{F}_q$

\[ x^n + 1 = \prod_{i=1}^{r} p_i(x)^{e_i}, \]

where $e_i \ge 1$ and $p_i(x)$ are distinct monic irreducible polynomials. Find the number of q-ary negacyclic codes.

7.41 Let $n$ be a positive integer with $\gcd(n, q) = 1$. Let $\alpha$ be a primitive $n$th root of unity in some extension field of $\mathbf{F}_q$. Let $g(x) \in \mathbf{F}_q[x]$ be the minimal polynomial of $\alpha$ with respect to $\mathbf{F}_q$. Assume that the degree of $g(x)$ is $m$. Let $C$ be the q-ary cyclic code with generator polynomial $g(x)$. Then the dual code $C^\perp$ is called an irreducible cyclic code. Show that $C^\perp$ is the trace code $\mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}(V)$, where $V$ is the one-dimensional $\mathbf{F}_{q^m}$-vector space $\langle (1, \alpha, \alpha^2, \ldots, \alpha^{n-1}) \rangle$.

7.42 Let $n$ be a positive integer and let $1 \le l \le n$ be a divisor of $n$. A linear code $C$ over $\mathbf{F}_q$ is quasi-cyclic of index $l$ (or $l$-quasi-cyclic) if

\[ (c_{n-l}, c_{n-l+1}, \ldots, c_{n-1}, c_0, c_1, \ldots, c_{n-l-1}) \in C \]

whenever $(c_0, c_1, \ldots, c_{n-1}) \in C$. In particular, a 1-quasi-cyclic code is a cyclic code.

(a) Show that the dual of an $l$-quasi-cyclic code is again $l$-quasi-cyclic.

(b) For every positive integer $m$, show that there exist self-dual 2-quasi-cyclic codes over $\mathbf{F}_q$ of length $2m$ if $q$ satisfies one of the following conditions:

(i) $q$ is a power of 2;

(ii) $q = p^b$, where $p$ is a prime congruent to 1 (mod 4);

(iii) $q = p^{2b}$, where $p$ is a prime congruent to 3 (mod 4).

7.43 Assume that $q > 3$ is a power of an odd prime. Let $C_1$, $C_2$ be two linear codes over $\mathbf{F}_q$ of length $n$.

(i) Using notation as in Exercise 6.6, show that C \ 0 C2 is a quasi-cyclic 
code of index n. 

(ii) Show that every quasi-cyclic code over F 9 of length 2 n of index n is 
of the form C\ 0 C2 for some suitably chosen linear codes C\ and 
C2 over F ? . 

7.44 For a prime power $q > 2$, let $1 \le m \le q - 1$ be a divisor of $q - 1$ and let $\alpha \in \mathbf{F}_q^*$ be an element of order $m$. Let $C_0, C_1, \ldots, C_{m-1}$ be linear codes over $\mathbf{F}_q$ of length $l$. Show that

\[ C := \left\{ (x_0, x_1, \ldots, x_{m-1}) : x_i = \sum_{j=0}^{m-1} \alpha^{ij} c_j, \text{ where } c_j \in C_j \text{ for } 0 \le j \le m - 1 \right\} \]

is a quasi-cyclic code of length $lm$ of index $l$. (Note: the code $C$ is called the Vandermonde product of $C_0, C_1, \ldots, C_{m-1}$.)

7.45 (a) Let $q$ be an even prime power and let $C_1$, $C_2$ be linear codes over $\mathbf{F}_q$ of length $n$. Show that (see Exercise 6.7)

\[ C := \{(a + x, b + x, a + b + x) : a, b \in C_1, \ x \in C_2\} \]

is an $n$-quasi-cyclic code over $\mathbf{F}_q$ of length $3n$.

(b) Let $q$ be a power of an odd prime such that $-1$ is not a square in $\mathbf{F}_q$. Let $i$ be an element of $\mathbf{F}_{q^2}$ such that $i^2 + 1 = 0$. Let Tr denote the trace $\mathrm{Tr}_{\mathbf{F}_{q^2}/\mathbf{F}_q}$ defined in Exercise 4.5. Let $C_1$, $C_2$ be linear codes over $\mathbf{F}_q$ of length $l$ and let $C_3$ be a linear code of length $l$ over $\mathbf{F}_{q^2}$. Show that

\[ C := \{(c_0, c_1, c_2, c_3) : c_j = x + (-1)^j y + \mathrm{Tr}(z i^j), \ x \in C_1, \ y \in C_2, \ z \in C_3\} \]

is an $l$-quasi-cyclic code over $\mathbf{F}_q$ of length $4l$.

The preceding chapter covered the subject of general cyclic codes. The structure of cyclic codes was analyzed, and two simple decoding algorithms were introduced. In particular, we showed that a cyclic code is totally determined by its generator polynomial. However, in general it is difficult to obtain information on the minimum distance of a cyclic code from its generator polynomial, even though the former is completely determined by the latter. On the other hand, if we choose some special generator polynomials properly, then information on the minimum distance can be gained, and also simpler decoding algorithms could apply. In this chapter, by carefully choosing the generator polynomials, we obtain several important classes of cyclic codes, such as BCH codes, Reed-Solomon codes and quadratic-residue codes. In addition to their structures, we also discuss a decoding algorithm for BCH codes.


8.1 BCH codes 

The class of Bose, Chaudhuri and Hocquenghem (BCH) codes is, in fact, 
a generalization of the Hamming codes for multiple-error correction (recall 
that Hamming codes correct only one error). Binary BCH codes were first 
discovered by A. Hocquenghem [8] in 1959 and independently by R. C. Bose 
and D. K. Ray-Chaudhuri [1] in 1960. Generalizations of the binary BCH codes 
to q-ary codes were obtained by D. Gorenstein and N. Zierler [5] in 1961.


8.1.1 Definitions 

We defined the least common multiple $\mathrm{lcm}(f_1(x), f_2(x))$ of two nonzero polynomials $f_1(x), f_2(x) \in \mathbf{F}_q[x]$ to be the monic polynomial of the lowest degree which is a multiple of both $f_1(x)$ and $f_2(x)$ (see Chapter 3). Suppose we have $t$ nonzero polynomials $f_1(x), f_2(x), \ldots, f_t(x) \in \mathbf{F}_q[x]$. The least common multiple of $f_1(x), \ldots, f_t(x)$ is the monic polynomial of the lowest degree which is a multiple of all of $f_1(x), \ldots, f_t(x)$, denoted by $\mathrm{lcm}(f_1(x), \ldots, f_t(x))$.

It can be proved that the least common multiple of the polynomials $f_1(x)$, $f_2(x)$, $f_3(x)$ is the same as $\mathrm{lcm}(\mathrm{lcm}(f_1(x), f_2(x)), f_3(x))$ (see Exercise 8.2).

By induction, one can prove that the least common multiple of the polynomials $f_1(x), f_2(x), \ldots, f_t(x)$ is the same as $\mathrm{lcm}(\mathrm{lcm}(f_1(x), \ldots, f_{t-1}(x)), f_t(x))$.

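Remark 8.1.1 If $f_1(x), \ldots, f_t(x) \in \mathbf{F}_q[x]$ have the following factorizations:

\[ f_1(x) = a_1 \cdot p_1(x)^{e_{1,1}} \cdots p_n(x)^{e_{1,n}}, \quad \ldots, \quad f_t(x) = a_t \cdot p_1(x)^{e_{t,1}} \cdots p_n(x)^{e_{t,n}}, \]

where $a_1, \ldots, a_t \in \mathbf{F}_q^*$, $e_{i,j} \ge 0$ and $p_i(x)$ are distinct monic irreducible polynomials over $\mathbf{F}_q$, then

\[ \mathrm{lcm}(f_1(x), \ldots, f_t(x)) = p_1(x)^{\max\{e_{1,1}, \ldots, e_{t,1}\}} \cdots p_n(x)^{\max\{e_{1,n}, \ldots, e_{t,n}\}}. \]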

Example 8.1.2 Consider the binary polynomials $f_1(x) = (1 + x)^2(1 + x + x^4)^3$, $f_2(x) = (1 + x)(1 + x + x^2)^2$, $f_3(x) = x^2(1 + x + x^4)$. Then we have, by the above remark, that

\[ \mathrm{lcm}(f_1(x), f_2(x), f_3(x)) = x^2(1 + x)^2(1 + x + x^2)^2(1 + x + x^4)^3. \]

Lemma 8.1.3 Let $f(x), f_1(x), f_2(x), \ldots, f_t(x)$ be polynomials over $\mathbf{F}_q$. If $f(x)$ is divisible by every polynomial $f_i(x)$ for $i = 1, 2, \ldots, t$, then $f(x)$ is divisible by $\mathrm{lcm}(f_1(x), f_2(x), \ldots, f_t(x))$ as well.

Proof. Put $g(x) = \mathrm{lcm}(f_1(x), f_2(x), \ldots, f_t(x))$. By the division algorithm, there exist two polynomials $u(x)$ and $r(x)$ over $\mathbf{F}_q$ such that $\deg(r(x)) < \deg(g(x))$ and

\[ f(x) = u(x)g(x) + r(x). \]

Thus, $r(x) = f(x) - u(x)g(x)$, and therefore $r(x)$ is also divisible by all $f_i(x)$. Since $g(x)$ has the smallest degree, this forces $r(x) = 0$. □

Example 8.1.4 The polynomial $f(x) = x^{15} - 1 \in \mathbf{F}_2[x]$ is divisible by $f_1(x) = 1 + x + x^2 \in \mathbf{F}_2[x]$, $f_2(x) = 1 + x + x^4 \in \mathbf{F}_2[x]$, and $f_3(x) = (1 + x + x^2)(1 + x^3 + x^4) \in \mathbf{F}_2[x]$, respectively. Then $f(x)$ is also divisible by $\mathrm{lcm}(f_1(x), f_2(x), f_3(x)) = (1 + x + x^2)(1 + x + x^4)(1 + x^3 + x^4)$.
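The divisibility claimed in Example 8.1.4 can be confirmed numerically. The sketch below works over $\mathbf{F}_2$ only, stores a polynomial as an integer whose bit $i$ is the coefficient of $x^i$, and builds the least common multiple as $\mathrm{lcm}(a, b) = ab/\gcd(a, b)$ via the Euclidean algorithm; the helper names are assumptions of the sketch.

```python
def gf2_mul(a, b):
    """Product of a(x) and b(x) over F_2."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def gf2_divmod(a, b):
    """Quotient and remainder of a(x) divided by nonzero b(x) over F_2."""
    q, deg_b = 0, b.bit_length() - 1
    while a.bit_length() - 1 >= deg_b:
        shift = a.bit_length() - 1 - deg_b
        q ^= 1 << shift
        a ^= b << shift
    return q, a


def gf2_gcd(a, b):
    while b:
        a, b = b, gf2_divmod(a, b)[1]
    return a


def gf2_lcm(a, b):
    return gf2_divmod(gf2_mul(a, b), gf2_gcd(a, b))[0]


f1 = 0b111                          # 1 + x + x^2
f2 = 0b10011                        # 1 + x + x^4
f3 = gf2_mul(0b111, 0b11001)        # (1 + x + x^2)(1 + x^3 + x^4)
lcm123 = gf2_lcm(gf2_lcm(f1, f2), f3)
f = (1 << 15) | 1                   # x^15 - 1 = x^15 + 1 over F_2
print(bin(lcm123), gf2_divmod(f, lcm123)[1])
```

The nested call mirrors the identity $\mathrm{lcm}(f_1, f_2, f_3) = \mathrm{lcm}(\mathrm{lcm}(f_1, f_2), f_3)$ stated before Remark 8.1.1, and a zero remainder is what Lemma 8.1.3 predicts.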

Example 8.1.5 Fix a primitive element $\alpha$ of $\mathbf{F}_{q^m}$ and denote by $M^{(i)}(x)$ the minimal polynomial of $\alpha^i$ with respect to $\mathbf{F}_q$. By Theorem 3.4.8, each root $\beta$ of $M^{(i)}(x)$ is an element of $\mathbf{F}_{q^m}$, and therefore satisfies $\beta^{q^m - 1} - 1 = 0$; i.e., $x - \beta$ is a linear divisor of $x^{q^m - 1} - 1$. By Theorem 3.4.8 again, $M^{(i)}(x)$ has no multiple roots. Hence, $M^{(i)}(x)$ is a divisor of $x^{q^m - 1} - 1$. For a subset $I$ of $\mathbf{Z}_{q^m - 1}$, the least common multiple $\mathrm{lcm}(M^{(i)}(x))_{i \in I}$ is a divisor of $x^{q^m - 1} - 1$ as well by Lemma 8.1.3.

The above example provides a method to find some divisors of $x^{q^m - 1} - 1$. These divisors can be chosen as generator polynomials of cyclic codes of length $q^m - 1$.

Definition 8.1.6 Let $\alpha$ be a primitive element of $\mathbf{F}_{q^m}$ and denote by $M^{(i)}(x)$ the minimal polynomial of $\alpha^i$ with respect to $\mathbf{F}_q$. A (primitive) BCH code over $\mathbf{F}_q$ of length $n = q^m - 1$ with designed distance $\delta$ is a q-ary cyclic code generated by $g(x) := \mathrm{lcm}(M^{(a)}(x), M^{(a+1)}(x), \ldots, M^{(a+\delta-2)}(x))$ for some integer $a$. Furthermore, the code is called narrow-sense if $a = 1$.

Example 8.1.7 (i) Let $\alpha$ be a primitive element of $\mathbf{F}_{2^m}$. Then a narrow-sense binary BCH code with designed distance 2 is a cyclic code generated by $M^{(1)}(x)$. It is in fact a Hamming code (see Exercise 7.20).

(ii) Let $\alpha \in \mathbf{F}_8$ be a root of $1 + x + x^3$. Then it is a primitive element of $\mathbf{F}_8$. The polynomials $M^{(1)}(x)$ and $M^{(2)}(x)$ are both equal to $1 + x + x^3$. Hence, a narrow-sense binary BCH code of length 7 generated by $\mathrm{lcm}(M^{(1)}(x), M^{(2)}(x)) = 1 + x + x^3$ is a [7, 4]-code. In fact, it is a binary [7, 4, 3]-Hamming code (see Exercise 7.20).

(iii) With $\alpha$ as in (ii), a binary BCH code of length 7 generated by $\mathrm{lcm}(M^{(0)}(x), M^{(1)}(x), M^{(2)}(x)) = \mathrm{lcm}(1 + x, 1 + x + x^3) = (1 + x)(1 + x + x^3)$ is a [7, 3]-cyclic code. It is easy to verify that this code is the dual code of the Hamming code of (ii).

Example 8.1.8 Let $\beta$ be a root of $1 + x + x^2 \in \mathbf{F}_2[x]$; then $\mathbf{F}_4 = \mathbf{F}_2[\beta]$. Let $\alpha$ be a root of $\beta + x + x^2 \in \mathbf{F}_4[x]$. Then $\alpha$ is a primitive element of $\mathbf{F}_{16}$. Consider the narrow-sense 4-ary BCH code of length 15 with designed distance 4. Then the generator polynomial is

\[ g(x) = \mathrm{lcm}(M^{(1)}(x), M^{(2)}(x), M^{(3)}(x)) = 1 + \beta x + \beta x^2 + x^3 + x^4 + \beta^2 x^5 + x^6. \]


8.1.2 Parameters of BCH codes 

The length of a BCH code is clearly q m — 1. We consider the dimension of 
BCH codes first. 





Theorem 8.1.9 (i) The dimension of a $q$-ary BCH code of length $q^m - 1$ generated
by $g(x) := \mathrm{lcm}(M^{(a)}(x), M^{(a+1)}(x), \ldots, M^{(a+\delta-2)}(x))$ is independent of
the choice of the primitive element $\alpha$.

(ii) A $q$-ary BCH code of length $q^m - 1$ with designed distance $\delta$ has dimension
at least $q^m - 1 - m(\delta - 1)$.


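Proof. (i) Let $C_i$ be the cyclotomic coset of $q$ modulo $q^m - 1$ containing $i$. Put
$S = \bigcup_{i=a}^{a+\delta-2} C_i$. By Theorem 3.4.8 and Remark 8.1.1, we have

$$g(x) = \mathrm{lcm}\Big(\prod_{i \in C_a}(x - \alpha^i), \prod_{i \in C_{a+1}}(x - \alpha^i), \ldots, \prod_{i \in C_{a+\delta-2}}(x - \alpha^i)\Big) = \prod_{i \in S}(x - \alpha^i).$$

Hence, the dimension is equal to $q^m - 1 - \deg(g(x)) = q^m - 1 - |S|$. As the
set $S$ is independent of the choice of $\alpha$, the desired result follows.

(ii) By part (i), the dimension $k$ satisfies

$$\begin{aligned}
k &= q^m - 1 - |S| \\
  &= q^m - 1 - \Big|\bigcup_{i=a}^{a+\delta-2} C_i\Big| \\
  &\ge q^m - 1 - \sum_{i=a}^{a+\delta-2} |C_i| \\
  &\ge q^m - 1 - \sum_{i=a}^{a+\delta-2} m \qquad \text{(by Remark 3.4.6(i))} \\
  &= q^m - 1 - m(\delta - 1).
\end{aligned}$$

This completes the proof. □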


The above result shows that, in order to find the dimension of a $q$-ary BCH
code of length $q^m - 1$ generated by $g(x) := \mathrm{lcm}(M^{(a)}(x), M^{(a+1)}(x), \ldots,
M^{(a+\delta-2)}(x))$, it is sufficient to check the cardinality of $\bigcup_{i=a}^{a+\delta-2} C_i$, where $C_i$ is
the cyclotomic coset of $q$ modulo $q^m - 1$ containing $i$.
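
The counting argument above can be sketched in a few lines of Python. This is only an illustration; the function names `cyclotomic_coset` and `bch_dimension` are our own and do not come from the text. The dimension is computed as the length minus the size of the union of the relevant cyclotomic cosets.

```python
def cyclotomic_coset(i, q, n):
    """Return the cyclotomic coset of q modulo n containing i, as a sorted list."""
    coset, j = set(), i % n
    while j not in coset:
        coset.add(j)
        j = (j * q) % n
    return sorted(coset)


def bch_dimension(q, m, a, delta):
    """Dimension of the q-ary BCH code of length q^m - 1 generated by
    lcm(M^(a), ..., M^(a+delta-2)): length minus the size of the coset union."""
    n = q**m - 1
    exponents = set()
    for i in range(a, a + delta - 1):
        exponents.update(cyclotomic_coset(i, q, n))
    return n - len(exponents)


# Binary code of length 15 with a = 2 and designed distance 3
print(bch_dimension(2, 4, 2, 3))
# Narrow-sense ternary code of length 26 with designed distance 5
print(bch_dimension(3, 3, 1, 5))
```

Calling `bch_dimension(2, m, 1, 2 * t + 1)` for small `m` and `t` gives a quick way to cross-check tabulated dimensions of narrow-sense binary BCH codes.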


Example 8.1.10 (i) Consider the following cyclotomic cosets of 2 modulo 15:

$$C_2 = \{1, 2, 4, 8\}, \quad C_3 = \{3, 6, 12, 9\}.$$

Then the dimension of the binary BCH code of length 15 of designed distance
3 generated by $g(x) := \mathrm{lcm}(M^{(2)}(x), M^{(3)}(x))$ is

$$15 - |C_2 \cup C_3| = 15 - 8 = 7.$$

Note that the lower bound in Theorem 8.1.9(ii) is attained for this example.

Table 8.1.

     n     k     t    |     n     k     t
    ------------------+------------------
     7     4     1    |    63    51     2
    15    11     1    |    63    45     3
    15     7     2    |    63    39     4
    15     5     3    |    63    36     5
    31    26     1    |    63    30     6
    31    21     2    |    63    24     7
    31    16     3    |    63    18    10
    31    11     5    |    63    16    11
    31     6     7    |    63    10    13
    63    57     1    |    63     7    15

(ii) Consider the following cyclotomic cosets of 3 modulo 26:

$$C_1 = C_3 = \{1, 3, 9\}, \quad C_2 = \{2, 6, 18\}, \quad C_4 = \{4, 12, 10\}.$$

Then the dimension of the ternary BCH code of length 26 and designed distance
5 generated by

$$g(x) := \mathrm{lcm}(M^{(1)}(x), M^{(2)}(x), M^{(3)}(x), M^{(4)}(x))$$

is

$$26 - |C_1 \cup C_2 \cup C_3 \cup C_4| = 26 - 9 = 17.$$

Note that, for this example, the dimension is strictly bigger than the lower bound
$26 - 3 \cdot 4 = 14$ in Theorem 8.1.9(ii).

Example 8.1.11 (i) For $t \ge 1$, $t$ and $2t$ belong to the same cyclotomic coset
of 2 modulo $2^m - 1$. This is equivalent to the fact that $M^{(t)}(x) = M^{(2t)}(x)$.
Therefore,

$$\mathrm{lcm}(M^{(1)}(x), \ldots, M^{(2t-1)}(x)) = \mathrm{lcm}(M^{(1)}(x), \ldots, M^{(2t)}(x));$$

i.e., the narrow-sense binary BCH codes of length $2^m - 1$ with designed distance
$2t + 1$ are the same as the narrow-sense binary BCH codes of length $2^m - 1$
with designed distance $2t$.

In Table 8.1 we list the dimensions of narrow-sense binary BCH codes of
length $2^m - 1$ with designed distance $2t + 1$, for $3 \le m \le 6$. Note that the
dimension of a narrow-sense BCH code is independent of the choice of the
primitive elements (see Theorem 8.1.9(i)).

(ii) Let $\alpha$ be a root of $1 + x + x^4 \in \mathbf{F}_2[x]$. Then $\alpha$ is a primitive element
of $\mathbf{F}_{16}$. By Example 3.4.7(i) and Theorem 3.4.8, we can compute the minimal
polynomials

$$\begin{aligned}
M^{(0)}(x) &= 1 + x, \\
M^{(1)}(x) &= M^{(2)}(x) = M^{(4)}(x) = M^{(8)}(x) = 1 + x + x^4, \\
M^{(3)}(x) &= M^{(6)}(x) = M^{(12)}(x) = M^{(9)}(x) = 1 + x + x^2 + x^3 + x^4, \\
M^{(5)}(x) &= M^{(10)}(x) = 1 + x + x^2, \\
M^{(7)}(x) &= M^{(14)}(x) = M^{(13)}(x) = M^{(11)}(x) = 1 + x^3 + x^4.
\end{aligned}$$

The generator polynomials of the narrow-sense binary BCH codes of length 15
in Table 8.1 are given in Table 8.2.

Table 8.2.

     n     k     t    Generator polynomial
    ---------------------------------------------------------------
    15    11     1    1 + x + x^4
    15     7     2    (1 + x + x^4)(1 + x + x^2 + x^3 + x^4)
    15     5     3    (1 + x + x^4)(1 + x + x^2 + x^3 + x^4)(1 + x + x^2)

Example 8.1.10(ii) shows that the lower bound in Theorem 8.1.9(ii) can be
improved in some cases. The following result gives a sufficient condition under
which the lower bound in Theorem 8.1.9(ii) can be achieved.


Proposition 8.1.12 A narrow-sense $q$-ary BCH code of length $q^m - 1$ with
designed distance $\delta$ has dimension exactly $q^m - 1 - m(\delta - 1)$ if $q \ne 2$ and
$\gcd(q^m - 1, e) = 1$ for all $1 \le e \le \delta - 1$.


Proof. From the proof of Theorem 8.1.9, we know that the dimension is equal
to

$$q^m - 1 - \Big|\bigcup_{i=1}^{\delta-1} C_i\Big|,$$

where $C_i$ stands for the cyclotomic coset of $q$ modulo $q^m - 1$ containing $i$.
Hence, it is sufficient to prove that $|C_i| = m$ for all $1 \le i \le \delta - 1$, and that $C_i$
and $C_j$ are disjoint for all $1 \le i < j \le \delta - 1$.

For any integer $1 \le t \le m - 1$, we claim that $i \not\equiv q^t i \pmod{q^m - 1}$ for
$1 \le i \le \delta - 1$. Otherwise, we would have $(q^t - 1)i \equiv 0 \pmod{q^m - 1}$. This
forces $q^t - 1 \equiv 0 \pmod{q^m - 1}$ as $\gcd(i, q^m - 1) = 1$. This is a contradiction,
since $0 < q^t - 1 < q^m - 1$. This implies that $|C_i| = m$ for all $1 \le i \le \delta - 1$.

For any integers $1 \le i < j \le \delta - 1$, we claim that $j \not\equiv q^s i \pmod{q^m - 1}$ for
any integer $s \ge 0$. Otherwise, we would have $j - i \equiv (q^s - 1)i \pmod{q^m - 1}$.
As $q - 1$ divides both $q^s - 1$ and $q^m - 1$, this forces $j - i \equiv 0 \pmod{q - 1}$.
Since $q \ne 2$, we have $q - 1 > 1$, so this is a contradiction to the condition
$\gcd(j - i, q^m - 1) = 1$. Hence, $C_i$ and $C_j$ are disjoint. □

Example 8.1.13 Consider a narrow-sense 4-ary BCH code of length 63 with 
designed distance 3. Its dimension is 63 — 3(3 — 1) = 57. 

As we know that a narrow-sense binary BCH code with designed distance $2t$
is the same as a narrow-sense binary BCH code with designed distance $2t + 1$,
it is enough to consider narrow-sense binary BCH codes with odd designed
distance.

Proposition 8.1.14 A narrow-sense binary BCH code of length $n = 2^m - 1$
and designed distance $\delta = 2t + 1$ has dimension at least $n - m(\delta - 1)/2$.

Proof. As the cyclotomic cosets $C_{2i}$ and $C_i$ are the same, the dimension $k$
satisfies

$$\begin{aligned}
k &= 2^m - 1 - \Big|\bigcup_{i=1}^{2t} C_i\Big| \\
  &= 2^m - 1 - \Big|\bigcup_{i=1}^{t} C_{2i-1}\Big| \\
  &\ge 2^m - 1 - \sum_{i=1}^{t} |C_{2i-1}| \\
  &\ge 2^m - 1 - tm \\
  &= 2^m - 1 - m(\delta - 1)/2.
\end{aligned}$$

□

Example 8.1.15 A narrow-sense binary BCH code of length 63 with designed
distance $\delta = 5$ has dimension exactly $51 = 63 - 6(5 - 1)/2$. However, a
narrow-sense binary BCH code of length 31 with designed distance $\delta = 11$ has
dimension 11, which is bigger than $31 - 5(11 - 1)/2 = 6$.

For the rest of this subsection, we study the minimum distance of BCH 
codes. 

Lemma 8.1.16 Let $C$ be a $q$-ary cyclic code of length $n$ with generator polynomial
$g(x)$. Suppose $\alpha_1, \ldots, \alpha_r$ are all the roots of $g(x)$ and the polynomial $g(x)$
has no multiple roots. Then an element $c(x)$ of $\mathbf{F}_q[x]/(x^n - 1)$ is a codeword
of $C$ if and only if $c(\alpha_i) = 0$ for all $i = 1, \ldots, r$.

Proof. If $c(x)$ is a codeword of $C$, then there exists a polynomial $f(x)$ such that
$c(x) = g(x)f(x)$. Thus, we have $c(\alpha_i) = g(\alpha_i)f(\alpha_i) = 0$ for all $i = 1, \ldots, r$.

Conversely, if $c(\alpha_i) = 0$ for all $i = 1, \ldots, r$, then $c(x)$ is divisible
by $g(x)$ since $g(x)$ has no multiple roots. This means that $c(x)$ is a codeword
of $C$. □

Example 8.1.17 Consider the binary [7, 4]-Hamming code with generator
polynomial $g(x) = 1 + x + x^3$. As all the elements of $\mathbf{F}_8 \setminus \{0, 1\}$ are roots
of $c(x) = 1 + x + x^2 + x^3 + x^4 + x^5 + x^6 = (x^7 - 1)/(x - 1)$, all the roots of
$g(x)$ are roots of $c(x)$ as well. Thus, 1111111 is a codeword.
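
The conclusion of this example can also be checked by direct division, since a word is a codeword exactly when $g(x)$ divides it. The sketch below is illustrative only: it stores a binary polynomial as a Python integer whose bit $i$ is the coefficient of $x^i$, and `gf2_divmod` is a helper name of our own.

```python
def gf2_divmod(a, b):
    """Divide polynomial a by b over F_2; bit i of an int is the coefficient of x^i."""
    quotient = 0
    while a and a.bit_length() >= b.bit_length():
        shift = a.bit_length() - b.bit_length()
        quotient ^= 1 << shift   # record the term x^shift
        a ^= b << shift          # subtract (= add, in characteristic 2) x^shift * b
    return quotient, a


g = 0b1011      # 1 + x + x^3
c = 0b1111111   # 1 + x + x^2 + ... + x^6

quotient, remainder = gf2_divmod(c, g)
print(remainder == 0)   # True exactly when g(x) divides c(x)
print(bin(quotient))    # the cofactor c(x) / g(x)
```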

The following theorem explains the term ‘designed distance’. 


Theorem 8.1.18 A BCH code with designed distance $\delta$ has minimum distance
at least $\delta$.


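Proof. Let $\alpha$ be a primitive element of $\mathbf{F}_{q^m}$ and let $C$ be a BCH code generated by
$g(x) := \mathrm{lcm}(M^{(a)}(x), M^{(a+1)}(x), \ldots, M^{(a+\delta-2)}(x))$. It is clear that the elements
$\alpha^a, \ldots, \alpha^{a+\delta-2}$ are roots of $g(x)$.

Suppose that the minimum distance $d$ of $C$ is less than $\delta$. Then there exists a
nonzero codeword $c(x) = c_0 + c_1 x + \cdots + c_{n-1}x^{n-1}$ such that $\mathrm{wt}(c(x)) = d < \delta$.
By Lemma 8.1.16, we have $c(\alpha^i) = 0$ for all $i = a, \ldots, a + \delta - 2$; i.e.,

$$\begin{pmatrix}
1 & \alpha^a & (\alpha^a)^2 & \cdots & (\alpha^a)^{n-1} \\
1 & \alpha^{a+1} & (\alpha^{a+1})^2 & \cdots & (\alpha^{a+1})^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \alpha^{a+\delta-2} & (\alpha^{a+\delta-2})^2 & \cdots & (\alpha^{a+\delta-2})^{n-1}
\end{pmatrix}
\begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_{n-1} \end{pmatrix} = 0. \tag{8.1}$$

Assume that the support of $c(x)$ is $R = \{i_1, \ldots, i_d\}$, i.e., $c_j \ne 0$ if and only if
$j \in R$. Then (8.1) becomes

$$\begin{pmatrix}
(\alpha^a)^{i_1} & (\alpha^a)^{i_2} & \cdots & (\alpha^a)^{i_d} \\
(\alpha^{a+1})^{i_1} & (\alpha^{a+1})^{i_2} & \cdots & (\alpha^{a+1})^{i_d} \\
\vdots & \vdots & & \vdots \\
(\alpha^{a+\delta-2})^{i_1} & (\alpha^{a+\delta-2})^{i_2} & \cdots & (\alpha^{a+\delta-2})^{i_d}
\end{pmatrix}
\begin{pmatrix} c_{i_1} \\ c_{i_2} \\ \vdots \\ c_{i_d} \end{pmatrix} = 0. \tag{8.2}$$

Since $d \le \delta - 1$, we obtain the following system of equations by choosing the
first $d$ equations of the above system of equations:

$$\begin{pmatrix}
(\alpha^a)^{i_1} & (\alpha^a)^{i_2} & \cdots & (\alpha^a)^{i_d} \\
(\alpha^{a+1})^{i_1} & (\alpha^{a+1})^{i_2} & \cdots & (\alpha^{a+1})^{i_d} \\
\vdots & \vdots & & \vdots \\
(\alpha^{a+d-1})^{i_1} & (\alpha^{a+d-1})^{i_2} & \cdots & (\alpha^{a+d-1})^{i_d}
\end{pmatrix}
\begin{pmatrix} c_{i_1} \\ c_{i_2} \\ \vdots \\ c_{i_d} \end{pmatrix} = 0. \tag{8.3}$$

The determinant $D$ of the coefficient matrix of the above equation is equal to

$$D = \prod_{j=1}^{d} (\alpha^a)^{i_j} \cdot \det
\begin{pmatrix}
1 & 1 & \cdots & 1 \\
\alpha^{i_1} & \alpha^{i_2} & \cdots & \alpha^{i_d} \\
(\alpha^2)^{i_1} & (\alpha^2)^{i_2} & \cdots & (\alpha^2)^{i_d} \\
\vdots & \vdots & & \vdots \\
(\alpha^{d-1})^{i_1} & (\alpha^{d-1})^{i_2} & \cdots & (\alpha^{d-1})^{i_d}
\end{pmatrix}
= \prod_{j=1}^{d} (\alpha^a)^{i_j} \prod_{k > l} (\alpha^{i_k} - \alpha^{i_l}) \ne 0, \tag{8.4}$$

where the last product runs over the pairs $k > l$ with $i_k, i_l \in R$; it is nonzero because
the elements $\alpha^{i_1}, \ldots, \alpha^{i_d}$ are pairwise distinct.

Combining (8.3) and (8.4), we obtain $(c_{i_1}, \ldots, c_{i_d}) = 0$. This is a
contradiction. □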


Example 8.1.19 (i) Let $\alpha$ be a root of $1 + x + x^3 \in \mathbf{F}_2[x]$, and let $C$ be the
binary BCH code of length 7 with designed distance 4 generated by

$$g(x) = \mathrm{lcm}(M^{(0)}(x), M^{(1)}(x), M^{(2)}(x)) = 1 + x^2 + x^3 + x^4.$$

Then $d(C) \le \mathrm{wt}(g(x)) = 4$. On the other hand, we have, by Theorem 8.1.18,
that $d(C) \ge 4$. Hence, $d(C) = 4$.

(ii) Let $\alpha$ be a root of $1 + x + x^4 \in \mathbf{F}_2[x]$. Then $\alpha$ is a primitive element of
$\mathbf{F}_{16}$. Consider the narrow-sense binary BCH code of length 15 with designed
distance 7. Then the generator polynomial is

$$\begin{aligned}
g(x) &= \mathrm{lcm}(M^{(1)}(x), M^{(2)}(x), \ldots, M^{(6)}(x)) \\
     &= M^{(1)}(x) M^{(3)}(x) M^{(5)}(x) \\
     &= 1 + x + x^2 + x^4 + x^5 + x^8 + x^{10}.
\end{aligned}$$

Therefore, $d(C) \le \mathrm{wt}(g(x)) = 7$. On the other hand, we have, by
Theorem 8.1.18, that $d(C) \ge 7$. Hence, $d(C) = 7$.
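
The two generator polynomials in this example can be recomputed by multiplying the minimal polynomials over $\mathbf{F}_2$. The following sketch is illustrative only; it encodes a binary polynomial as an integer bitmask (bit $i$ is the coefficient of $x^i$), and the helper names `gf2_mul` and `exponents` are our own.

```python
def gf2_mul(a, b):
    """Multiply two polynomials over F_2 (bit i of an int = coefficient of x^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def exponents(poly):
    """List the exponents of the nonzero terms of a binary polynomial."""
    return [i for i in range(poly.bit_length()) if (poly >> i) & 1]


# Part (i): (1 + x)(1 + x + x^3)
g_i = gf2_mul(0b11, 0b1011)

# Part (ii): M^(1) M^(3) M^(5) with the minimal polynomials over F_2 for F_16
m1 = 0b10011   # 1 + x + x^4
m3 = 0b11111   # 1 + x + x^2 + x^3 + x^4
m5 = 0b111     # 1 + x + x^2
g_ii = gf2_mul(gf2_mul(m1, m3), m5)

print(exponents(g_i), len(exponents(g_i)))     # terms and Hamming weight of g in (i)
print(exponents(g_ii), len(exponents(g_ii)))   # terms and Hamming weight of g in (ii)
```

The weight of each generator polynomial gives the upper bound on $d(C)$ used in the example; the matching lower bound still comes from Theorem 8.1.18, not from this computation.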





Example 8.1.20 Let $\alpha$ be a primitive element of $\mathbf{F}_{2^m}$ and let $M^{(1)}(x)$ be the
minimal polynomial of $\alpha$ with respect to $\mathbf{F}_2$. Consider the narrow-sense binary
BCH code $C$ of length $n = 2^m - 1$ with designed distance 3 generated by

$$g(x) = \mathrm{lcm}(M^{(1)}(x), M^{(2)}(x)) = M^{(1)}(x).$$

Then, $d(C) \ge 3$ by Theorem 8.1.18. $C$ is in fact a binary Hamming code by
Exercise 7.20. Hence, $d(C) = 3$.


8.1.3 Decoding of BCH codes 

The decoding algorithm we describe in this section is divided into three 
steps: (i) calculating the syndromes; (ii) finding the error locator polynomial; 
(iii) finding all roots of the error locator polynomial. For simplicity, we will 
discuss only the decoding of narrow-sense binary BCH codes. 

Let $C$ be a narrow-sense binary BCH code of length $n = 2^m - 1$ with
designed distance $\delta = 2t + 1$ generated by $g(x) := \mathrm{lcm}(M^{(1)}(x), M^{(2)}(x), \ldots,
M^{(\delta-1)}(x))$, where $M^{(i)}(x)$ is the minimal polynomial of $\alpha^i$ with respect
to $\mathbf{F}_2$ for a primitive element $\alpha$ of $\mathbf{F}_{2^m}$.

Put

$$H = \begin{pmatrix}
1 & \alpha & \alpha^2 & \cdots & \alpha^{n-1} \\
1 & \alpha^2 & (\alpha^2)^2 & \cdots & (\alpha^2)^{n-1} \\
1 & \alpha^3 & (\alpha^3)^2 & \cdots & (\alpha^3)^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \alpha^{\delta-1} & (\alpha^{\delta-1})^2 & \cdots & (\alpha^{\delta-1})^{n-1}
\end{pmatrix}.$$

Then it can be shown that a word $c \in \mathbf{F}_2^n$ is a codeword of $C$ if and only if
$cH^T = 0$ (see Exercise 8.9). Therefore, we can define the syndrome $S_H(w)$ of
a word $w \in \mathbf{F}_2^n$ with respect to $H$ by $wH^T$. Some properties of $S_H(w)$ are also
contained in Exercise 8.9.

Suppose that $w(x) = w_0 + w_1 x + \cdots + w_{n-1}x^{n-1}$ is a received word with
the error polynomial $e(x)$ satisfying $\mathrm{wt}(e(x)) \le t$. Put $c(x) = w(x) - e(x)$,
then $c(x)$ is a codeword.

Step 1: Calculation of syndromes The syndrome of $w(x)$ is

$$(s_0, s_1, \ldots, s_{\delta-2}) := (w_0, w_1, \ldots, w_{n-1})H^T.$$

It is clear that $s_i = w(\alpha^{i+1}) = e(\alpha^{i+1})$ for all $i = 0, 1, \ldots, \delta - 2$, since $\alpha^{i+1}$
are roots of $g(x)$.

Assume that the errors take place at positions $i_0, i_1, \ldots, i_{l-1}$ with $l \le t$; i.e.,

$$e(x) = x^{i_0} + x^{i_1} + \cdots + x^{i_{l-1}}. \tag{8.6}$$

Then we obtain a system of equations

$$\begin{aligned}
\alpha^{i_0} + \alpha^{i_1} + \cdots + \alpha^{i_{l-1}} &= s_0 = w(\alpha), \\
(\alpha^{i_0})^2 + (\alpha^{i_1})^2 + \cdots + (\alpha^{i_{l-1}})^2 &= s_1 = w(\alpha^2), \\
&\;\;\vdots \\
(\alpha^{i_0})^{\delta-1} + (\alpha^{i_1})^{\delta-1} + \cdots + (\alpha^{i_{l-1}})^{\delta-1} &= s_{\delta-2} = w(\alpha^{\delta-1}).
\end{aligned} \tag{8.7}$$

Any method for solving the above system of equations is a decoding algorithm
for BCH codes.
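
Step 1 can be sketched in code as follows. This is a minimal illustration under our own assumptions: elements of $\mathbf{F}_{2^m}$ are stored as integer bitmasks in the basis $1, \alpha, \ldots, \alpha^{m-1}$, the field is built from a primitive polynomial supplied as a bitmask, and the names `build_field` and `syndromes` are hypothetical helpers. As a small test case we take $\mathbf{F}_8$ with $\alpha$ a root of $1 + x + x^3$ and the word $1 + x + x^2 + x^3$.

```python
def build_field(m, prim_poly):
    """Return (exp, log) tables for F_{2^m}.

    prim_poly is a bitmask of a primitive polynomial of degree m (bit m set);
    exp[i] is alpha^i as a bitmask and log inverts it on nonzero elements.
    """
    size = (1 << m) - 1
    exp, log = [0] * size, {}
    x = 1
    for i in range(size):
        exp[i], log[x] = x, i
        x <<= 1                 # multiply by alpha
        if x >> m:              # reduce modulo the primitive polynomial
            x ^= prim_poly
    return exp, log


def syndromes(word, delta, exp):
    """word[i] is the coefficient of x^i; returns [w(alpha), ..., w(alpha^(delta-1))]."""
    n = len(exp)
    result = []
    for k in range(1, delta):
        s = 0
        for i, bit in enumerate(word):
            if bit:
                s ^= exp[(i * k) % n]   # add (alpha^k)^i; addition is XOR
        result.append(s)
    return result


exp, log = build_field(3, 0b1011)       # F_8 with alpha a root of 1 + x + x^3
w = [1, 1, 1, 1, 0, 0, 0]               # w(x) = 1 + x + x^2 + x^3
s = syndromes(w, 3, exp)
# Print each nonzero syndrome as an exponent of alpha (log is undefined at 0)
print([log[v] for v in s if v != 0])
```

A word is a codeword exactly when every syndrome is zero, so `syndromes` doubles as a membership test.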

Step 2: Finding the error locator polynomial For $e(x) = x^{i_0} + x^{i_1} + \cdots +
x^{i_{l-1}}$, define the error locator polynomial by

$$\sigma(z) := \prod_{j=0}^{l-1} (1 - \alpha^{i_j} z).$$

It is clear that the error positions $i_j$ can be found as long as all the roots of $\sigma(z)$
are known. For this step, we have to determine the error locator polynomial
$\sigma(z)$.

Theorem 8.1.21 Suppose the syndrome polynomial $s(z) = \sum_{j=0}^{\delta-2} s_j z^j$ is not
the zero polynomial. Then there exists a nonzero polynomial $r(z) \in \mathbf{F}_{2^m}[z]$ such
that $\deg(r(z)) \le t - 1$, $\gcd(r(z), \sigma(z)) = 1$ and

$$r(z) \equiv s(z)\sigma(z) \pmod{z^{\delta-1}}. \tag{8.8}$$

Moreover, for any pair $(u(z), v(z))$ of nonzero polynomials over $\mathbf{F}_{2^m}$ satisfying
$\deg(u(z)) \le t - 1$, $\deg(v(z)) \le t$ and

$$u(z) \equiv s(z)v(z) \pmod{z^{\delta-1}}, \tag{8.9}$$

we have

$$\sigma(z) = \beta v(z), \quad r(z) = \beta u(z), \tag{8.10}$$

for a nonzero element $\beta \in \mathbf{F}_{2^m}$.

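Proof. (Uniqueness.) Multiplying (8.8) by $v(z)$ and (8.9) by $\sigma(z)$ gives

$$v(z)r(z) \equiv \sigma(z)u(z) \pmod{z^{\delta-1}}. \tag{8.11}$$

As $\deg(v(z)r(z)) \le 2t - 1 = \delta - 2$ and $\deg(\sigma(z)u(z)) \le 2t - 1 = \delta - 2$, it
follows from (8.11) that $v(z)r(z) = \sigma(z)u(z)$. By the conditions that
$\gcd(r(z), \sigma(z)) = 1$ and all the polynomials are nonzero, we obtain $\sigma(z) =
\beta v(z)$ and $r(z) = \beta u(z)$ for a nonzero element $\beta \in \mathbf{F}_{2^m}$.

(Existence.) Put

$$r(z) = \sigma(z) \sum_{j=0}^{l-1} \frac{\alpha^{i_j}}{1 - \alpha^{i_j} z}.$$

Then

$$\begin{aligned}
\frac{r(z)}{\sigma(z)} &= \sum_{j=0}^{l-1} \frac{\alpha^{i_j}}{1 - \alpha^{i_j} z} \\
&= \sum_{j=0}^{l-1} \alpha^{i_j} \sum_{k=0}^{\infty} (\alpha^{i_j} z)^k \\
&\equiv \sum_{j=0}^{l-1} \alpha^{i_j} \sum_{k=0}^{\delta-2} (\alpha^{i_j} z)^k \\
&= \sum_{k=0}^{\delta-2} \Big( \sum_{j=0}^{l-1} (\alpha^{i_j})^{k+1} \Big) z^k \\
&= \sum_{k=0}^{\delta-2} w(\alpha^{k+1}) z^k \\
&= s(z) \pmod{z^{\delta-1}}.
\end{aligned}$$

As $r(1/\alpha^{i_j}) \ne 0$ for all $j$, we know that $\gcd(r(z), \sigma(z)) = 1$. This completes
the proof. □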

From the above theorem, we find that, to determine the error locator polynomial
$\sigma(z)$, it is sufficient to solve the polynomial congruence (8.8). This can
be done by the Euclidean algorithm (see ref. [11]).

Step 3: Finding the roots of the error locator polynomial This is easy
to do as we can search for all possible roots by evaluating $\sigma(z)$ at $\alpha^i$, for all
$i = 1, 2, \ldots$. Each root of $\sigma(z)$ is the inverse $\alpha^{-i_j}$ of some $\alpha^{i_j}$, so after all the
roots of $\sigma(z)$ are found, we obtain the error positions $i_j$ and hence the
error polynomial (8.6).

We use an example to illustrate the above three steps. 


Example 8.1.22 Let $\alpha$ be a root of $g(x) = 1 + x + x^3 \in \mathbf{F}_2[x]$. Then the
Hamming code generated by $g(x) = \mathrm{lcm}(M^{(1)}(x), M^{(2)}(x))$ has the designed
distance $\delta = 3$. Suppose that $w(x) = 1 + x + x^2 + x^3$ is a received word.

(i) Calculation of syndromes:

$$(s_0, s_1) = (w(\alpha), w(\alpha^2)) = (\alpha^2, \alpha^4).$$

(ii) Finding the error locator polynomial.

Solve the polynomial congruence

$$r(z) \equiv s(z)\sigma(z) \pmod{z^2}$$

with $\deg(r(z)) = 0$ and $\deg(\sigma(z)) \le 1$, and

$$s(z) = \alpha^2 + \alpha^4 z.$$

We have $\sigma(z) = 1 + \alpha^2 z$ and $r(z) = \alpha^2$. Hence, the error takes place at
the third position. Thus, we decode $w(x)$ to $w(x) - x^2 = 1 + x + x^3 =
1101000$.
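
The root search of Step 3 for this example can be sketched as follows. The code is an illustration under our own conventions, not the book's procedure verbatim: field elements of $\mathbf{F}_8$ are integer bitmasks, `build_field`, `gf_mul` and `error_positions` are hypothetical helper names, and the error locator polynomial $\sigma(z) = 1 + \alpha^2 z$ is taken as given from part (ii) rather than computed by the Euclidean algorithm.

```python
def build_field(m, prim_poly):
    """Return (exp, log) tables for F_{2^m} built from a primitive polynomial bitmask."""
    size = (1 << m) - 1
    exp, log = [0] * size, {}
    x = 1
    for i in range(size):
        exp[i], log[x] = x, i
        x <<= 1
        if x >> m:
            x ^= prim_poly
    return exp, log


def gf_mul(a, b, exp, log):
    """Multiply two field elements using the log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return exp[(log[a] + log[b]) % len(exp)]


def error_positions(sigma, exp, log):
    """sigma[k] is the coefficient of z^k. Return all i with sigma(alpha^(-i)) = 0."""
    n = len(exp)
    positions = []
    for i in range(n):
        z = exp[(-i) % n]                 # candidate root alpha^(-i)
        value, power = 0, 1
        for coeff in sigma:
            value ^= gf_mul(coeff, power, exp, log)
            power = gf_mul(power, z, exp, log)
        if value == 0:
            positions.append(i)
    return positions


exp, log = build_field(3, 0b1011)         # alpha is a root of 1 + x + x^3
sigma = [1, exp[2]]                       # sigma(z) = 1 + alpha^2 z
w = [1, 1, 1, 1, 0, 0, 0]                 # received word 1 + x + x^2 + x^3

positions = error_positions(sigma, exp, log)
for i in positions:
    w[i] ^= 1                             # flip the erroneous bits
print(positions, w)
```

Position indices here start at 0, so index 2 corresponds to the third position in the example.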


8.2 Reed-Solomon codes 

The most important subclass of BCH codes is the class of Reed-Solomon (RS) 
codes. RS codes were introduced by I. S. Reed and G. Solomon independently 
of the work by A. Hocquenghem, R. C. Bose and D. K. Ray-Chaudhuri. 

Consider a $q$-ary BCH code $C$ of length $q^m - 1$ generated by $g(x) :=
\mathrm{lcm}(M^{(a)}(x), M^{(a+1)}(x), \ldots, M^{(a+\delta-2)}(x))$, where $M^{(i)}(x)$ is the minimal polynomial
of $\alpha^i$ with respect to $\mathbf{F}_q$ for a primitive element $\alpha$ of $\mathbf{F}_{q^m}$. If $m = 1$, we
obtain a $q$-ary BCH code of length $q - 1$. In this case, $\alpha$ is a primitive element
of $\mathbf{F}_q$ and, moreover, the minimal polynomial of $\alpha^i$ with respect to $\mathbf{F}_q$ is $x - \alpha^i$.
Thus, for $\delta \le q - 1$, the generator polynomial becomes

$$\begin{aligned}
g(x) &= \mathrm{lcm}(x - \alpha^a, x - \alpha^{a+1}, \ldots, x - \alpha^{a+\delta-2}) \\
     &= (x - \alpha^a)(x - \alpha^{a+1}) \cdots (x - \alpha^{a+\delta-2})
\end{aligned}$$

since $\alpha^a, \alpha^{a+1}, \ldots, \alpha^{a+\delta-2}$ are pairwise distinct.

Definition 8.2.1 A $q$-ary Reed-Solomon code (RS code) is a $q$-ary BCH code
of length $q - 1$ generated by

$$g(x) = (x - \alpha^{a+1})(x - \alpha^{a+2}) \cdots (x - \alpha^{a+\delta-1}),$$

with $a \ge 0$ and $2 \le \delta \le q - 1$, where $\alpha$ is a primitive element of $\mathbf{F}_q$.

We never consider binary RS codes as, in this case, the length is $q - 1 = 1$.

Example 8.2.2 (i) Consider the 7-ary RS code of length 6 with generator polynomial
$g(x) = (x - 3)(x - 3^2)(x - 3^3) = 6 + x + 3x^2 + x^3$. This is a 7-ary
[6, 3]-code.

We can form a generator matrix from $g(x)$:

$$G = \begin{pmatrix}
6 & 1 & 3 & 1 & 0 & 0 \\
0 & 6 & 1 & 3 & 1 & 0 \\
0 & 0 & 6 & 1 & 3 & 1
\end{pmatrix}.$$

A parity-check matrix

$$H = \begin{pmatrix}
1 & 4 & 1 & 1 & 0 & 0 \\
0 & 1 & 4 & 1 & 1 & 0 \\
0 & 0 & 1 & 4 & 1 & 1
\end{pmatrix}$$

is obtained from $h(x) = (x^6 - 1)/g(x) = 1 + x + 4x^2 + x^3$. It can be checked
from the above parity-check matrix that the minimum distance is 4. Hence, this
is a 7-ary [6, 3, 4]-MDS code.
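
Because this code has only $7^3 = 343$ codewords, its parameters can also be confirmed by brute force. The sketch below is an illustration under our own conventions: polynomials over $\mathbf{F}_7$ are coefficient lists with the constant term first, and `poly_mul` is a helper name of our own. It rebuilds $g(x)$ from its roots and then finds the minimum weight over all nonzero multiples $f(x)g(x)$ with $\deg f \le 2$.

```python
from itertools import product

P = 7  # field size


def poly_mul(a, b, p=P):
    """Multiply two polynomials over F_p (coefficient lists, constant term first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out


# g(x) = (x - 3)(x - 3^2)(x - 3^3) over F_7
g = [1]
for k in (1, 2, 3):
    g = poly_mul(g, [(-pow(3, k, P)) % P, 1])
print(g)

# Every codeword is f(x) g(x) with deg f <= 2; no reduction mod x^6 - 1 is needed
min_weight = min(
    sum(1 for c in poly_mul(list(f), g) if c)
    for f in product(range(P), repeat=3)
    if any(f)
)
print(min_weight)
```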

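(ii) Consider the 8-ary RS code of length 7 with generator polynomial $g(x) =
(x - \alpha)(x - \alpha^2) = 1 + \alpha + (\alpha^2 + \alpha)x + x^2$, where $\alpha$ is a root of $1 + x + x^3 \in
\mathbf{F}_2[x]$. This is an 8-ary [7, 5]-code.

We can form a generator matrix from $g(x)$:

$$G = \begin{pmatrix}
\alpha + 1 & \alpha^2 + \alpha & 1 & 0 & 0 & 0 & 0 \\
0 & \alpha + 1 & \alpha^2 + \alpha & 1 & 0 & 0 & 0 \\
0 & 0 & \alpha + 1 & \alpha^2 + \alpha & 1 & 0 & 0 \\
0 & 0 & 0 & \alpha + 1 & \alpha^2 + \alpha & 1 & 0 \\
0 & 0 & 0 & 0 & \alpha + 1 & \alpha^2 + \alpha & 1
\end{pmatrix}.$$

A parity-check matrix

$$H = \begin{pmatrix}
1 & \alpha^4 & 1 & 1 + \alpha^4 & 1 + \alpha^4 & \alpha^4 & 0 \\
0 & 1 & \alpha^4 & 1 & 1 + \alpha^4 & 1 + \alpha^4 & \alpha^4
\end{pmatrix}$$

is obtained from $h(x) = (x^7 - 1)/g(x) = \alpha^4 + (1 + \alpha^4)x + (1 + \alpha^4)x^2 +
x^3 + \alpha^4 x^4 + x^5$. It can be checked from the above parity-check matrix that the
minimum distance is 3. Hence, this is an 8-ary [7, 5, 3]-MDS code.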

The two RS codes in the above example are both MDS. In fact, it is true in 
general that RS codes are MDS. 

Theorem 8.2.3 Reed-Solomon codes are MDS; i.e., a $q$-ary Reed-Solomon
code of length $q - 1$ generated by $g(x) = \prod_{i=a+1}^{a+\delta-1} (x - \alpha^i)$ is a $[q - 1, q - \delta, \delta]$-cyclic
code for any $2 \le \delta \le q - 1$.

Proof. As the degree of $g(x)$ is $\delta - 1$, the dimension of the code is exactly
$k := q - 1 - (\delta - 1) = q - \delta$.
By Theorem 8.1.18, the minimum distance is at least $\delta$. On the other hand,
the minimum distance is at most $(q - 1) + 1 - k = \delta$ by the Singleton bound
(see Theorem 5.4.1). The desired result follows. □


Example 8.2.4 Let $\alpha$ be a root of $1 + x + x^4 \in \mathbf{F}_2[x]$. Then $\alpha$ is a primitive
element of $\mathbf{F}_{16}$. Consider the RS code generated by $g(x) = (x - \alpha^3)
(x - \alpha^4)(x - \alpha^5)(x - \alpha^6)$. It is not so easy to work out its minimum distance
from its parity-check matrices. However, by Theorem 8.2.3, it is a [15, 11, 5]-cyclic
code over $\mathbf{F}_{16}$.

Next we consider the extended codes of RS codes. 


Theorem 8.2.5 Let $C$ be a $q$-ary RS code generated by $g(x) = \prod_{i=1}^{\delta-1} (x - \alpha^i)$
with $2 \le \delta \le q - 1$. Then the extended code $\overline{C}$ is still MDS.

Proof. Since $C$ is a $[q - 1, q - \delta, \delta]$-cyclic code, we have to show that $\overline{C}$ is
a $[q, q - \delta, \delta + 1]$-code. Let $c(x) = \sum_{i=0}^{q-2} c_i x^i$ be a nonzero codeword of
$C$. It is sufficient to prove that the Hamming weight of $\overline{c} = (c_0, \ldots, c_{q-2},
-\sum_{i=0}^{q-2} c_i)$ is at least $\delta + 1$. Let $c(x) = f(x)g(x)$ for some $f(x) \in \mathbf{F}_q[x]/
(x^{q-1} - 1)$.

Case 1: $f(1) \ne 0$. It is clear that $g(1) \ne 0$. Hence, $c(1) = \sum_{i=0}^{q-2} c_i \ne 0$.
Then the Hamming weight of $\overline{c}$ is equal to $\mathrm{wt}(c(x)) + 1$, which is at least
$d(C) + 1 = \delta + 1$.

Case 2: $f(1) = 0$, i.e., $(x - 1)$ is a linear factor of $f(x)$. Put $f(x) =
(x - 1)u(x)$. Then, $c(x) = u(x)(x - 1)g(x) = u(x)\prod_{i=0}^{\delta-1} (x - \alpha^i)$ is
also a codeword of the BCH code of designed distance $\delta + 1$ generated by
$\prod_{i=0}^{\delta-1} (x - \alpha^i)$. Hence, the Hamming weight of $c(x)$ is at least $\delta + 1$ by
Theorem 8.1.18. Thus, the Hamming weight of $\overline{c}$ is at least $\delta + 1$. □

Example 8.2.6 (i) Consider the 7-ary [6, 3, 4]-RS code as in Example 8.2.2(i).
By Theorem 5.1.9, the matrix

$$\begin{pmatrix}
1 & 4 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 4 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 4 & 1 & 1 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}$$

is a parity-check matrix of the extended code. Hence, by Corollary 4.5.7, the
extended code has minimum distance 5, and thus it is a [7, 3, 5]-MDS code.





(ii) Consider the 8-ary [7, 5, 3]-RS code as in Example 8.2.2(ii). By Theorem
5.1.9, the matrix

$$\begin{pmatrix}
1 & \alpha^4 & 1 & 1 + \alpha^4 & 1 + \alpha^4 & \alpha^4 & 0 & 0 \\
0 & 1 & \alpha^4 & 1 & 1 + \alpha^4 & 1 + \alpha^4 & \alpha^4 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}$$

is a parity-check matrix of the extended code. Hence, the extended code has
minimum distance 4, and thus it is an [8, 5, 4]-MDS code.

RS codes are MDS codes. Hence, they have very good parameters. Un- 
fortunately, RS codes are nonbinary, while practical applications often require 
binary codes. In practice, the concatenation technique is used to produce binary 
codes from RS codes over extension fields of F 2 . 

Let $C$ be an $[n, k]$-RS code over $\mathbf{F}_{2^m}$, where $n = 2^m - 1$. Applying the
concatenation technique as in Theorem 6.3.1, we concatenate $C$ with the trivial
code $\mathbf{F}_2^m$.

Let $\alpha_1, \ldots, \alpha_m$ be an $\mathbf{F}_2$-basis of $\mathbf{F}_{2^m}$ and consider the map

$$\phi : \mathbf{F}_{2^m} \to \mathbf{F}_2^m, \quad u_1\alpha_1 + u_2\alpha_2 + \cdots + u_m\alpha_m \mapsto (u_1, u_2, \ldots, u_m).$$

Then, by Theorem 6.3.1, we have the following result.

Theorem 8.2.7 Let $C$ be an $[n, k]$-RS code over $\mathbf{F}_{2^m}$, where $n = 2^m - 1$. Then

$$\phi^*(C) := \{(\phi(c_0), \ldots, \phi(c_{n-1})) : (c_0, \ldots, c_{n-1}) \in C\}$$

is a binary $[mn, mk]$-code with minimum distance at least $n - k + 1$.

Example 8.2.8 Consider the 8-ary RS code $C$ generated by

$$g(x) = \prod_{i=1}^{6} (x - \alpha^i) = \sum_{i=0}^{6} x^i,$$

where $\alpha$ is a root of $1 + x + x^3$. Hence,

$$C = \{a(1, 1, 1, 1, 1, 1, 1) : a \in \mathbf{F}_8\}$$

is the trivial 8-ary [7, 1, 7]-MDS code. The code $\phi^*(C)$ is a binary [21, 3, 7]-linear
code spanned by

    100100...100, 010010...010, 001001...001.

For an RS code $C$, the code $\phi^*(C)$ cannot correct too many random errors
as the minimum distance is not very big. However, it can correct many more
burst errors.

Theorem 8.2.9 Let $C$ be a $[2^m - 1, k]$-RS code over $\mathbf{F}_{2^m}$. Then, the code
$\phi^*(C)$ can correct $m\lfloor (n - k)/2 \rfloor - m + 1$ burst errors (that is, every burst
error of length at most this number), where $n = 2^m - 1$ is
the length of the code.

Proof. Put $l = m\lfloor (n - k)/2 \rfloor - m + 1$. By Theorem 7.5.3, it is sufficient to
show that all the burst errors of length $l$ or less lie in distinct cosets.

Let $e_1, e_2 \in \mathbf{F}_2^{mn}$ be two burst errors of length $l$ or less that lie in the same
coset of $\phi^*(C)$. Let $c_i$ be the pre-image of $e_i$ under the map $\phi^*$; i.e., $\phi^*(c_i) = e_i$
for $i = 1, 2$. Then it is clear that

$$\mathrm{wt}(c_i) \le \Big\lceil \frac{l - 1}{m} \Big\rceil + 1 = \Big\lfloor \frac{n - k}{2} \Big\rfloor = \Big\lfloor \frac{d(C) - 1}{2} \Big\rfloor \qquad \text{(as $C$ is an MDS code)},$$

for $i = 1, 2$, and $c_1, c_2$ are in the same coset of $C$. By Exercise 4.44, we know
that $c_1 = c_2$. This means that $e_1 = e_2$ since $\phi^*$ is injective. □

Example 8.2.10 For an 8-ary [7, 3, 5]-RS code, the code $\phi^*(C)$ is a binary
[21, 9]-linear code. It can correct

$$3\Big\lfloor \frac{7 - 3}{2} \Big\rfloor - 3 + 1 = 4$$

burst errors.



8.3 Quadratic-residue codes 

Quadratic-residue (QR) codes have been extensively studied for many years.
Examples of good quadratic-residue codes are the binary [7, 4, 3]-Hamming
code, the binary [23, 12, 7]-Golay code and the ternary [11, 6, 5]-Golay code.

Let $p$ be a prime number bigger than 2 and choose a primitive element $g$
of $\mathbf{F}_p$ (we know the existence of primitive elements by Proposition 3.3.9(ii)).
A nonzero element $r$ of $\mathbf{F}_p$ is called a quadratic residue modulo $p$ if $r = g^{2i}$
for some integer $i$ when $r$ is viewed as an element of $\mathbf{F}_p$; otherwise, $r$ is called
a quadratic nonresidue modulo $p$. It is clear that $r$ is a quadratic nonresidue
modulo $p$ if and only if $r = g^{2j-1}$ for some integer $j$.

Example 8.3.1 (i) Consider the finite field $\mathbf{F}_7$. It is easy to check that 3 is a
primitive element of $\mathbf{F}_7$. Thus, the nonzero quadratic residues modulo 7 are
$\{3^{2i} : i = 0, 1, \ldots\} = \{1, 2, 4\}$, and the quadratic nonresidues modulo 7 are
$\{3^{2i-1} : i = 1, 2, \ldots\} = \{3, 6, 5\}$.

(ii) Consider the finite field $\mathbf{F}_{11}$. It is easy to check that 2 is a primitive
element of $\mathbf{F}_{11}$. Thus, the nonzero quadratic residues modulo 11 are $\{2^{2i} :
i = 0, 1, \ldots\} = \{1, 4, 5, 9, 3\}$, and the quadratic nonresidues modulo 11 are
$\{2^{2i-1} : i = 1, 2, \ldots\} = \{2, 8, 10, 7, 6\}$.

(iii) Consider the finite field $\mathbf{F}_{23}$. It is easy to check that 5 is a
primitive element of $\mathbf{F}_{23}$. Thus, the nonzero quadratic residues modulo
23 are $\{5^{2i} : i = 0, 1, \ldots\} = \{1, 2, 4, 8, 16, 9, 18, 13, 3, 6, 12\}$, and
the quadratic nonresidues modulo 23 are $\{5^{2i-1} : i = 1, 2, \ldots\} =
\{5, 10, 20, 17, 11, 22, 21, 19, 15, 7, 14\}$.
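
The three lists above are easy to regenerate. The following sketch is illustrative only, with function names of our own: it computes the even and odd powers of a given primitive element, in the order used in the example, and compares the even powers with the set of nonzero squares modulo $p$.

```python
def quadratic_residues(p):
    """Sorted list of the nonzero squares modulo an odd prime p."""
    return sorted({(a * a) % p for a in range(1, p)})


def residues_from_primitive(p, g):
    """Even powers g^(2i) and odd powers g^(2i-1) of a primitive element g of F_p."""
    half = (p - 1) // 2
    even = [pow(g, 2 * i, p) for i in range(half)]
    odd = [pow(g, 2 * i - 1, p) for i in range(1, half + 1)]
    return even, odd


for p, g in [(7, 3), (11, 2), (23, 5)]:
    even, odd = residues_from_primitive(p, g)
    print(p, even, odd, sorted(even) == quadratic_residues(p))
```

That the even powers agree with the squares for every choice of primitive element is a general fact, and this check only confirms it for the three primes shown.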

We now show that quadratic residues modulo p are independent of the choice 
of the primitive element. 

Proposition 8.3.2 A nonzero element $r$ of $\mathbf{F}_p$ is a nonzero quadratic residue
modulo $p$ if and only if $r \equiv a^2 \pmod{p}$ for some $a \in \mathbf{F}_p^*$. In particular,
quadratic residues modulo $p$ are independent of the choice of the primitive
element.

Proof. Let $g$ be a primitive element of $\mathbf{F}_p$. If $r$ is a nonzero quadratic residue
modulo $p$, then, by the definition, $r = g^{2i}$ for some integer $i$. Putting $a = g^i$,
we have $r \equiv a^2 \pmod{p}$.

Conversely, if $r \equiv a^2 \pmod{p}$ for some $a \in \mathbf{F}_p^*$, then $r = a^2$ in $\mathbf{F}_p$. Since
$g$ is a primitive element of $\mathbf{F}_p$, there exists an integer $i$ such that $a = g^i$. Thus,
$r = g^{2i}$; i.e., $r$ is a quadratic residue modulo $p$. □

Example 8.3.3 2 is a quadratic residue modulo 17 as $2 \equiv 6^2 \pmod{17}$.

Proposition 8.3.4 Let $p$ be an odd prime. Denote by $Q_p$ and $N_p$ the sets of
nonzero quadratic residues and quadratic nonresidues modulo $p$, respectively.
Then we have the following.

(i) The product of two quadratic residues modulo $p$ is a quadratic residue
modulo $p$.

(ii) The product of two quadratic nonresidues modulo $p$ is a quadratic residue
modulo $p$.

(iii) The product of a nonzero quadratic residue modulo $p$ with a quadratic
nonresidue modulo $p$ is a quadratic nonresidue modulo $p$.

(iv) There are exactly $(p - 1)/2$ nonzero quadratic residues modulo $p$ and
$(p - 1)/2$ quadratic nonresidues modulo $p$, and therefore $\mathbf{F}_p = \{0\} \cup
Q_p \cup N_p$.

(v) For $\alpha \in Q_p$ and $\beta \in N_p$, we have that

$$\begin{aligned}
\alpha Q_p &= \{\alpha r : r \in Q_p\} = Q_p, \\
\beta Q_p &= \{\beta r : r \in Q_p\} = N_p, \\
\alpha N_p &= \{\alpha n : n \in N_p\} = N_p
\end{aligned}$$

and

$$\beta N_p = \{\beta n : n \in N_p\} = Q_p.$$

Proof. Let $g$ be a primitive element of $\mathbf{F}_p$, and let $\gamma, \theta$ be two quadratic residues
modulo $p$. Then, there exist two integers $i, j$ such that $\gamma = g^{2i}$ and $\theta = g^{2j}$.
Hence, $\gamma\theta = g^{2(i+j)}$ is a quadratic residue modulo $p$.

The same arguments can be employed to prove parts (ii) and (iii).

It is clear that all the nonzero quadratic residues modulo $p$ are

$$\{g^{2i} : i = 0, 1, \ldots, (p - 3)/2\},$$

and that all the quadratic nonresidues modulo $p$ are

$$\{g^{2i-1} : i = 1, 2, \ldots, (p - 1)/2\}.$$

Thus, part (iv) follows.

Part (v) follows from parts (i)-(iv) immediately. □

Example 8.3.5 Consider the finite field $\mathbf{F}_{11}$. The set of nonzero quadratic
residues modulo 11 is $Q_{11} = \{1, 4, 5, 9, 3\}$, and the set of quadratic nonresidues
modulo 11 is $N_{11} = \{2, 8, 10, 7, 6\}$. We have $|Q_{11}| = |N_{11}| = 5 = (11 - 1)/2$.
Furthermore, by choosing $4 \in Q_{11}$ and $2 \in N_{11}$, we have

$$\begin{aligned}
4Q_{11} &= \{4 \cdot 1, 4 \cdot 4, 4 \cdot 5, 4 \cdot 9, 4 \cdot 3\} = \{4, 5, 9, 3, 1\} = Q_{11}, \\
2Q_{11} &= \{2 \cdot 1, 2 \cdot 4, 2 \cdot 5, 2 \cdot 9, 2 \cdot 3\} = \{2, 8, 10, 7, 6\} = N_{11}, \\
4N_{11} &= \{4 \cdot 2, 4 \cdot 8, 4 \cdot 10, 4 \cdot 7, 4 \cdot 6\} = \{8, 10, 7, 6, 2\} = N_{11}
\end{aligned}$$

and

$$2N_{11} = \{2 \cdot 2, 2 \cdot 8, 2 \cdot 10, 2 \cdot 7, 2 \cdot 6\} = \{4, 5, 9, 3, 1\} = Q_{11}.$$

Choose a prime $l$ such that $l \ne p$ and $l$ is a quadratic residue modulo $p$.
Choose an integer $m \ge 1$ such that $l^m - 1$ is divisible by $p$. Let $\theta$ be a primitive
element of $\mathbf{F}_{l^m}$ and put $\alpha = \theta^{(l^m - 1)/p}$. Then, the order of $\alpha$ is $p$; i.e., $1 = \alpha^0 =
\alpha^p, \alpha = \alpha^1, \alpha^2, \ldots, \alpha^{p-1}$ are pairwise distinct and $x^p - 1 = \prod_{i=0}^{p-1} (x - \alpha^i)$.
Consider the polynomials

$$g_Q(x) := \prod_{r \in Q_p} (x - \alpha^r) \quad \text{and} \quad g_N(x) := \prod_{n \in N_p} (x - \alpha^n). \tag{8.12}$$

It follows from Proposition 8.3.4(iv) that

$$x^p - 1 = (x - 1)g_Q(x)g_N(x). \tag{8.13}$$


Moreover, we have the following result.

Lemma 8.3.6 The polynomials $g_Q(x)$ and $g_N(x)$ belong to $\mathbf{F}_l[x]$.

Proof. It is sufficient to show that each coefficient of $g_Q(x)$ and $g_N(x)$ belongs
to $\mathbf{F}_l$.

Let $g_Q(x) = a_0 + a_1 x + \cdots + a_k x^k$, where $a_i \in \mathbf{F}_{l^m}$ and $k = (p - 1)/2$.
Raising each coefficient to its $l$th power, we obtain

$$\begin{aligned}
a_0^l + a_1^l x + \cdots + a_k^l x^k &= \prod_{r \in Q_p} (x - \alpha^{lr}) \\
&= \prod_{j \in lQ_p} (x - \alpha^j) \\
&= \prod_{j \in Q_p} (x - \alpha^j) \\
&= g_Q(x).
\end{aligned}$$

Note that we use the fact that $lQ_p = Q_p$ in the above argument. Hence,
$a_i = a_i^l$ for all $0 \le i \le k$; i.e., the $a_i$ are elements of $\mathbf{F}_l$. This means that $g_Q(x)$ is
a polynomial over $\mathbf{F}_l$.

The same argument can be used to show that $g_N(x)$ is a polynomial
over $\mathbf{F}_l$. □

Example 8.3.7 (i) Let $p = 7$ and $l = 2$. Let $\alpha$ be a root of $1 + x + x^3 \in \mathbf{F}_2[x]$.
Then the order of $\alpha$ is 7. The two polynomials defined in (8.12) are

$$\begin{aligned}
g_Q(x) &= \prod_{r \in Q_7} (x - \alpha^r) \\
&= (x - \alpha)(x - \alpha^2)(x - \alpha^4) \\
&= 1 + x + x^3
\end{aligned}$$

and

$$\begin{aligned}
g_N(x) &= \prod_{n \in N_7} (x - \alpha^n) \\
&= (x - \alpha^3)(x - \alpha^6)(x - \alpha^5) \\
&= 1 + x^2 + x^3.
\end{aligned}$$

Furthermore, we have

$$x^7 - 1 = (x - 1)g_Q(x)g_N(x).$$

(ii) Let $p = 11$ and $l = 3$. Let $\theta$ be a root of $1 + 2x + x^5 \in \mathbf{F}_3[x]$. Then $\theta$ is a
primitive element of $\mathbf{F}_{3^5}$, and the order of $\alpha := \theta^{22}$ is 11. The two polynomials
defined in (8.12) are

$$\begin{aligned}
g_Q(x) &= \prod_{r \in Q_{11}} (x - \alpha^r) \\
&= (x - \alpha)(x - \alpha^4)(x - \alpha^5)(x - \alpha^9)(x - \alpha^3) \\
&= 2 + x^2 + 2x^3 + x^4 + x^5
\end{aligned}$$

and

$$\begin{aligned}
g_N(x) &= \prod_{n \in N_{11}} (x - \alpha^n) \\
&= (x - \alpha^2)(x - \alpha^8)(x - \alpha^{10})(x - \alpha^7)(x - \alpha^6) \\
&= 2 + 2x + x^2 + 2x^3 + x^5.
\end{aligned}$$

Furthermore, we have

$$x^{11} - 1 = (x - 1)g_Q(x)g_N(x).$$

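(iii) Let $p = 23$ and $l = 2$. Let $\theta$ be a root of $1 + x + x^3 + x^5 + x^{11} \in \mathbf{F}_2[x]$.
Then $\theta$ is a primitive element of $\mathbf{F}_{2^{11}}$, and the order of $\alpha := \theta^{89}$ is 23. The two
polynomials defined in (8.12) are

$$\begin{aligned}
g_Q(x) &= \prod_{r \in Q_{23}} (x - \alpha^r) \\
&= (x - \alpha)(x - \alpha^2)(x - \alpha^4)(x - \alpha^8)(x - \alpha^{16}) \\
&\quad \times (x - \alpha^9)(x - \alpha^{18})(x - \alpha^{13})(x - \alpha^3)(x - \alpha^6)(x - \alpha^{12}) \\
&= 1 + x^2 + x^4 + x^5 + x^6 + x^{10} + x^{11}
\end{aligned}$$

and

$$\begin{aligned}
g_N(x) &= \prod_{n \in N_{23}} (x - \alpha^n) \\
&= (x - \alpha^5)(x - \alpha^{10})(x - \alpha^{20})(x - \alpha^{17})(x - \alpha^{11}) \\
&\quad \times (x - \alpha^{22})(x - \alpha^{21})(x - \alpha^{19})(x - \alpha^{15})(x - \alpha^7)(x - \alpha^{14}) \\
&= 1 + x + x^5 + x^6 + x^7 + x^9 + x^{11}.
\end{aligned}$$

Furthermore, we have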

$$x^{23} - 1 = (x - 1)g_Q(x)g_N(x).$$

Definition 8.3.8 Let $p$ and $l$ be two distinct primes such that $l$ is a quadratic
residue modulo $p$. Choose an integer $m \ge 1$ such that $l^m - 1$ is divisible
by $p$. Let $\theta$ be a primitive element of $\mathbf{F}_{l^m}$ and put $\alpha = \theta^{(l^m - 1)/p}$. The divisors
of $x^p - 1$

$$g_Q(x) := \prod_{r \in Q_p} (x - \alpha^r) \quad \text{and} \quad g_N(x) := \prod_{n \in N_p} (x - \alpha^n)$$

are defined over $\mathbf{F}_l$. The $l$-ary cyclic codes $C_Q = \langle g_Q(x) \rangle$ and $C_N =
\langle g_N(x) \rangle$ of length $p$ are called quadratic-residue (QR) codes.

It is obvious that the dimensions of both the quadratic-residue codes $C_Q$ and
$C_N$ are $p - (p - 1)/2 = (p + 1)/2$.
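
The factorization $x^p - 1 = (x - 1)g_Q(x)g_N(x)$ and the dimension $(p + 1)/2$ can be checked numerically for the pairs of generator polynomials listed in Example 8.3.7. The sketch below is only an illustration: polynomials over $\mathbf{F}_l$ are coefficient lists with the constant term first, each polynomial is entered as a dictionary from exponent to coefficient, and the helper names are our own.

```python
def poly_mul(a, b, l):
    """Multiply two polynomials over F_l (coefficient lists, constant term first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % l
    return out


def from_terms(terms, l):
    """Build a coefficient list from a {exponent: coefficient} dictionary."""
    coeffs = [0] * (max(terms) + 1)
    for e, c in terms.items():
        coeffs[e] = c % l
    return coeffs


def check_qr_pair(p, l, gq_terms, gn_terms):
    """Return (does x^p - 1 equal (x - 1) g_Q g_N over F_l?, dimension p - deg g_Q)."""
    gq, gn = from_terms(gq_terms, l), from_terms(gn_terms, l)
    product = poly_mul([(-1) % l, 1], poly_mul(gq, gn, l), l)
    target = [(-1) % l] + [0] * (p - 1) + [1]      # x^p - 1
    return product == target, p - (len(gq) - 1)


EXAMPLES = {
    (7, 2): ({0: 1, 1: 1, 3: 1}, {0: 1, 2: 1, 3: 1}),
    (11, 3): ({0: 2, 2: 1, 3: 2, 4: 1, 5: 1}, {0: 2, 1: 2, 2: 1, 3: 2, 5: 1}),
    (23, 2): ({0: 1, 2: 1, 4: 1, 5: 1, 6: 1, 10: 1, 11: 1},
              {0: 1, 1: 1, 5: 1, 6: 1, 7: 1, 9: 1, 11: 1}),
}

for (p, l), (gq_terms, gn_terms) in EXAMPLES.items():
    print(p, l, check_qr_pair(p, l, gq_terms, gn_terms))
```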

Example 8.3.9 (i) Consider the binary quadratic-residue codes $C_Q = \langle 1 +
x + x^3 \rangle$ and $C_N = \langle 1 + x^2 + x^3 \rangle$ of length 7. It is easy to verify that
these two codes are equivalent (see Proposition 8.3.12) and that both are binary
[7, 4, 3]-Hamming codes.

(ii) Consider the ternary quadratic-residue codes $C_Q = \langle 2 + x^2 + 2x^3 +
x^4 + x^5 \rangle$ and $C_N = \langle 2 + 2x + x^2 + 2x^3 + x^5 \rangle$ of length 11. It is easy
to verify that these two codes are equivalent (see Proposition 8.3.12) and that
both are equivalent to the ternary [11, 6, 5]-Golay code defined in Section 5.3.

(iii) Consider the binary quadratic-residue codes $C_Q = \langle 1 + x^2 + x^4 + x^5 +
x^6 + x^{10} + x^{11} \rangle$ and $C_N = \langle 1 + x + x^5 + x^6 + x^7 + x^9 + x^{11} \rangle$ of length
23. It is easy to verify that these two codes are equivalent (see Proposition
8.3.12) and that both are equivalent to the binary [23, 12, 7]-Golay code defined
in Section 5.3.

From the above example, we can see that the codes $C_Q$ and $C_N$ are equivalent
in these three cases. This is, in fact, true in general. We prove the following
lemma first.

Lemma 8.3.10 Let $m, n$ be two integers bigger than 1 and $\gcd(m, n) = 1$.
Then the map

$$\chi_m : \mathbf{F}_q[x]/(x^n - 1) \to \mathbf{F}_q[x]/(x^n - 1), \quad a(x) \mapsto a(x^m)$$

is a permutation of $\mathbf{F}_q^n$ if we identify $\mathbf{F}_q^n$ with $\mathbf{F}_q[x]/(x^n - 1)$ through the map

$$\pi : (f_0, f_1, \ldots, f_{n-1}) \mapsto \sum_{i=0}^{n-1} f_i x^i.$$

Proof. Let $f(x) = \sum_{i=0}^{n-1} f_i x^i$. Then, we have

$$\chi_m(f(x)) = f(x^m) \pmod{x^n - 1} = \sum_{i=0}^{n-1} f_i x^{(im \bmod n)}.$$

Hence, it is sufficient to show that

$$0,\ (m \bmod n),\ (2m \bmod n),\ \ldots,\ ((n - 1)m \bmod n)$$

is a permutation of $0, 1, 2, \ldots, n - 1$. This is clearly true, as $\gcd(m, n) = 1$. □


Example 8.3.11 Consider the map

\[ \chi_3 : \mathbf{F}_3[x]/(x^5 - 1) \to \mathbf{F}_3[x]/(x^5 - 1), \quad a(x) \mapsto a(x^3). \]

Then,

\[ \chi_3(1 + 2x + x^4) = 1 + 2x^3 + x^{12} \ (\mathrm{mod}\ x^5 - 1) = 1 + x^2 + 2x^3. \]

Clearly, (1, 0, 1, 2, 0) is a permutation of (1, 2, 0, 0, 1).
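
The coordinate permutation of Lemma 8.3.10 can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function name `chi` and the representation of a polynomial by its list of coefficients $(f_0, \ldots, f_{n-1})$ are assumptions made for illustration.

```python
from math import gcd

def chi(m, coeffs):
    """Apply a(x) -> a(x^m) mod (x^n - 1) to a coefficient list of length n."""
    n = len(coeffs)
    if gcd(m, n) != 1:
        raise ValueError("m and n must be coprime")
    image = [0] * n
    for i, c in enumerate(coeffs):
        # the term c * x^i is sent to c * x^(m*i mod n)
        image[(m * i) % n] = c
    return image

# 1 + 2x + x^4 over F_3, as in Example 8.3.11
print(chi(3, [1, 2, 0, 0, 1]))  # [1, 0, 1, 2, 0], i.e. 1 + x^2 + 2x^3
```

Because the exponents $mi \ (\mathrm{mod}\ n)$ are pairwise distinct, no coefficient is overwritten, so the output is a rearrangement of the input.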

Proposition 8.3.12 The two $l$-ary quadratic-residue codes $C_Q$ and $C_N$ are
equivalent.

Proof. By definition, $C_Q = \langle g_Q(x) \rangle$ and $C_N = \langle g_N(x) \rangle$. Choose a
quadratic nonresidue $m$ modulo $p$ and consider the map

\[ \chi_m : \mathbf{F}_l[x]/(x^p - 1) \to \mathbf{F}_l[x]/(x^p - 1), \quad a(x) \mapsto a(x^m). \]

Then, $\chi_m(C_Q)$ is an equivalent code of $C_Q$ by Lemma 8.3.10. We claim that
the code $\chi_m(C_Q)$ is in fact the same as $C_N$. This is equivalent to
$\chi_m(C_Q) \subseteq C_N$, as $|\chi_m(C_Q)| = |C_N|$. Hence, it is sufficient to show that
$\chi_m(g_Q(x)) \in C_N$; i.e.,
$g_N(x) = \prod_{r \in N_p} (x - \alpha^r)$ is a divisor of
$\chi_m(g_Q(x)) = \prod_{r \in Q_p} (x^m - \alpha^r)$. Let
$t$ be a quadratic nonresidue modulo $p$; then $tm$ is a nonzero quadratic residue
modulo $p$. Hence,

\[ 0 = g_Q(\alpha^{tm}) = g_Q((\alpha^t)^m) = \chi_m(g_Q)(\alpha^t); \]

i.e., $\alpha^t$ is a root of $\chi_m(g_Q(x))$. This implies that $g_N(x)$ is a divisor of
$\chi_m(g_Q(x))$, as $g_N(x)$ has no multiple roots. □

Finally, we determine the possible lengths $p$ for which a binary
quadratic-residue code exists; i.e., those primes $p$ such that 2 is a quadratic
residue modulo $p$.

Table 8.3.

Length    Dimension    Distance
   7          4            3
  17          9            5
  23         12            7
  31         16            7
  41         21            9
  47         24           11
  71         36           11
  73         37           13
  79         40           15
  89         45           17


Proposition 8.3.13 (i) Let $p$ be an odd prime and let $r$ be an integer such
that $\gcd(r, p) = 1$. Then, $r$ is a quadratic residue modulo $p$ if and only if
$r^{(p-1)/2} \equiv 1 \pmod p$.

(ii) For an odd prime $p$, 2 is a quadratic residue modulo $p$ if $p$ is of the
form $p = 8m \pm 1$, and it is a quadratic nonresidue modulo $p$ if $p$ is of the form
$p = 8m \pm 3$.

Proof. (i) Let $g$ be a primitive element of $\mathbf{F}_p$. If $r$ is a quadratic residue
modulo $p$, then $r = g^{2i}$ for some $i$. Hence, $r^{(p-1)/2} = g^{i(p-1)} = 1$ in $\mathbf{F}_p$; i.e.,
$r^{(p-1)/2} \equiv 1 \pmod p$.

Conversely, suppose that $r^{(p-1)/2} \equiv 1 \pmod p$. Let $r = g^j$ for some integer
$j$. Then $g^{j(p-1)/2} = 1$ in $\mathbf{F}_p$. This means that $j(p-1)/2$ is divisible by $p - 1$;
i.e., $j$ is even, so $r = (g^{j/2})^2$ is a quadratic residue modulo $p$.

(ii) Consider the following $(p-1)/2$ numbers:

\[ 2 \times 1,\quad 2 \times 2,\quad \ldots,\quad 2 \times \lfloor (p-1)/4 \rfloor, \]
\[ p - 2(\lfloor (p-1)/4 \rfloor + 1),\quad p - 2(\lfloor (p-1)/4 \rfloor + 2),\quad \ldots,\quad p - 2((p-1)/2). \]

All of these $(p-1)/2$ numbers are between 1 and $(p-1)/2$ (both inclusive)
and it is easy to verify that they are pairwise distinct. Thus, their product is
equal to

\[ \left(\frac{p-1}{2}\right)! = \prod_{i=1}^{\lfloor (p-1)/4 \rfloor} 2i \prod_{j=\lfloor (p-1)/4 \rfloor + 1}^{(p-1)/2} (p - 2j) \equiv (-1)^e\, 2^{(p-1)/2} \left(\frac{p-1}{2}\right)! \pmod p, \]

where $e = (p-1)/2 - \lfloor (p-1)/4 \rfloor$. Hence, we obtain $2^{(p-1)/2} \equiv (-1)^e \pmod p$.
It is easy to check that $e$ is even if and only if $p$ is of the form $p = 8m \pm 1$,
and that $e$ is odd if and only if $p$ is of the form $p = 8m \pm 3$. The desired result
then follows from part (i). □

Corollary 8.3.14 There exist binary quadratic-residue codes of length p if and 
only if p is a prime of the form p = 8m ± 1. 

Example 8.3.15 We list the parameters of the first ten binary quadratic-residue 
codes in Table 8.3. 
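
The admissible lengths can be recovered numerically from Euler's criterion in Proposition 8.3.13(i). The following is a minimal Python sketch, not taken from the original text; the helper names `is_prime` and `two_is_qr` are assumptions made for illustration.

```python
def is_prime(p):
    """Trial division, adequate for the small range used here."""
    return p > 1 and all(p % d for d in range(2, int(p ** 0.5) + 1))

def two_is_qr(p):
    """Euler's criterion applied to r = 2 for an odd prime p."""
    return pow(2, (p - 1) // 2, p) == 1

# Odd primes below 90 for which 2 is a quadratic residue
lengths = [p for p in range(3, 90) if is_prime(p) and two_is_qr(p)]
print(lengths)  # [7, 17, 23, 31, 41, 47, 71, 73, 79, 89]

# Each such prime is congruent to 1 or 7 modulo 8, i.e. of the form 8m +/- 1
print(all(p % 8 in (1, 7) for p in lengths))  # True
```

The ten primes printed are exactly the lengths listed in Table 8.3.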


Exercises 

8.1 Find the least common multiple of the following polynomials over $\mathbf{F}_2$:

$f_1(x) = 1 + x^2$, $f_2(x) = 1 + x + x^2 + x^4$, $f_3(x) = 1 + x^2 + x^4 + x^6$.

8.2 Suppose we have three nonzero polynomials $f_1(x)$, $f_2(x)$ and $f_3(x)$.
Show that $\mathrm{lcm}(f_1(x), f_2(x), f_3(x)) = \mathrm{lcm}(\mathrm{lcm}(f_1(x), f_2(x)), f_3(x))$.

8.3 Construct a generator polynomial and a parity-check matrix for a binary
double-error-correcting BCH code of length 15.

8.4 Let $\alpha$ be a root of $1 + x + x^4 \in \mathbf{F}_2[x]$.

(a) Show that $\alpha^7$ is a primitive element of $\mathbf{F}_{16}$, and find the minimal
polynomial of $\alpha^7$ with respect to $\mathbf{F}_2$.

(b) Let $g(x) \in \mathbf{F}_2[x]$ be the polynomial of lowest degree such that
$g(\alpha^{7i}) = 0$, for $i = 1, 2, 3, 4$. Determine $g(x)$ and construct a
parity-check matrix of the binary cyclic code generated by $g(x)$.

8.5 Determine the generator polynomials of all binary BCH codes of
length 31 with designed distance 5.

8.6 Construct the generator polynomial for a self-orthogonal binary BCH
code of length 31 and dimension 15.

8.7 Let $\alpha$ be a root of $1 + x + x^4 \in \mathbf{F}_2[x]$. Let $C$ be the narrow-sense binary
BCH code of length 15 with designed distance 5.

(a) Find the generator polynomial of $C$.

(b) If possible, determine the error positions of the following received
words:

(i) $w(x) = 1 + x^6 + x^7 + x^8$;

(ii) $w(x) = 1 + x + x^4 + x^5 + x^6 + x^9$;

(iii) $w(x) = 1 + x + x^7$.





8.8 Let $\alpha$ be a root of $1 + x + x^4 \in \mathbf{F}_2[x]$. Let $C$ be the narrow-sense binary
BCH code of length 15 with designed distance 7.

(a) Show that $C$ is generated by $g(x) = 1 + x + x^2 + x^4 + x^5 + x^8 + x^{10}$.

(b) Let $w(x) = 1 + x + x^6 + x^7 + x^8$ be a received word. Find the
syndrome polynomial, the error locator polynomial and decode the
word $w(x)$.

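8.9 Let $C$ be a narrow-sense $q$-ary BCH code of length $n = q^m - 1$
with designed distance $\delta$ generated by
$g(x) := \mathrm{lcm}(M^{(1)}(x), M^{(2)}(x), \ldots, M^{(\delta-1)}(x))$, where $M^{(i)}(x)$ is the
minimal polynomial of $\alpha^i$ with respect to $\mathbf{F}_q$ for a primitive element
$\alpha$ of $\mathbf{F}_{q^m}$.

Put

\[ H = \begin{pmatrix}
1 & \alpha & \alpha^2 & \cdots & \alpha^{n-1} \\
1 & \alpha^2 & (\alpha^2)^2 & \cdots & (\alpha^2)^{n-1} \\
1 & \alpha^3 & (\alpha^3)^2 & \cdots & (\alpha^3)^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \alpha^{\delta-1} & (\alpha^{\delta-1})^2 & \cdots & (\alpha^{\delta-1})^{n-1}
\end{pmatrix}. \]

Define the syndrome $S_H(\mathbf{w})$ of a word $\mathbf{w} \in \mathbf{F}_q^n$ with respect to $H$ by $\mathbf{w}H^{\mathrm{T}}$.

Show that, for any two words $\mathbf{u}, \mathbf{v} \in \mathbf{F}_q^n$, we have

(a) $S_H(\mathbf{u} + \mathbf{v}) = S_H(\mathbf{u}) + S_H(\mathbf{v})$;

(b) $S_H(\mathbf{u}) = 0$ if and only if $\mathbf{u} \in C$;

(c) $S_H(\mathbf{u}) = S_H(\mathbf{v})$ if and only if $\mathbf{u}$ and $\mathbf{v}$ are in the same coset of $C$.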

8.10 Show that the minimum distance of a narrow-sense binary BCH code is 
always odd. 

8.11 Show that a narrow-sense binary BCH code of length $n = 2^m - 1$ and
designed distance $2t + 1$ has minimum distance $2t + 1$, provided that

\[ \sum_{i=0}^{t+1} \binom{2^m - 1}{i} > 2^{mt}. \]

8.12 Show that the narrow-sense binary BCH codes of length 31 and designed
distance $\delta = 3, 5, 7$ have minimum distance 3, 5, 7, respectively.

8.13 Show that the minimum distance of a $q$-ary BCH code of length $n$ and
designed distance $\delta$ is equal to $\delta$, provided that $n$ is divisible by $\delta$.

8.14 (i) Show that the cyclotomic cosets $C_1, C_3, C_5, \ldots, C_{2t+1}$ of 2 modulo
$2^m - 1$ are pairwise distinct and that each contains exactly $m$ elements,
provided

\[ 2t + 1 < 2^{\lfloor m/2 \rfloor} + 1. \]

(ii) Show that a narrow-sense binary BCH code of length $n = 2^m - 1$
with designed distance $2t + 1$ has dimension $n - mt$ if

\[ 2t + 1 < 2^{\lfloor m/2 \rfloor} + 1. \]


8.15 Determine whether the dual of an arbitrary BCH code is a BCH code. 

8.16 Find a generator matrix of a [10, 6]-RS code over $\mathbf{F}_{11}$ and determine the
minimum distance.

8.17 Determine the generator polynomial of a 16-ary RS code of dimension 
10 and find a parity-check matrix. 

8.18 Show that, for all $n \le q$ and $1 \le k \le n$, there exists an $[n, k]$-MDS code
over $\mathbf{F}_q$.

8.19 Show that the dual of an RS code is again an RS code. 

8.20 Determine the generator polynomials of all the 16-ary self-orthogonal RS 
codes. 

8.21 Let $\alpha$ be a root of $1 + x + x^2 \in \mathbf{F}_2[x]$. Consider the map

\[ \phi : \mathbf{F}_4 \to \mathbf{F}_2^2, \quad a_0 + a_1\alpha \mapsto (a_0, a_1). \]

Let $C$ be a [3, 2]-RS code over $\mathbf{F}_4$. Determine the parameters of $\phi^*(C)$.

8.22 Let $C$ be a $q$-ary RS code generated by $g(x) = \prod_{i=1}^{\delta-1} (x - \alpha^i)$ with
$3 \le \delta \le q - 2$, where $\alpha$ is a primitive element of $\mathbf{F}_q$. Show that the
extended code $\overline{C}$ is equivalent to a cyclic code if and only if $q$ is a prime.

8.23 Determine all quadratic residues modulo p = 17, 29, 31, respectively. 

8.24 For an odd prime $p$, define Legendre's symbol by

\[ \left(\frac{a}{p}\right) = \begin{cases}
0 & \text{if } p \mid a \\
1 & \text{if } a \text{ is a quadratic residue and } \gcd(a, p) = 1 \\
-1 & \text{if } a \text{ is a quadratic nonresidue.}
\end{cases} \]

(a) Show that

\[ \left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod p. \]

(b) Show that

\[ \left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right). \]

(c) Show that, if $q$ is an odd prime and $p \ne q$, then

\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{(p-1)(q-1)/4}. \]

(Note: this is the law of quadratic reciprocity.)

8.25 For the following primes $p$, $l$ and elements $\alpha$, determine the polynomials
$g_Q(x)$ and $g_N(x)$ over $\mathbf{F}_l$ as defined in Definition 8.3.8.

(a) $p = 7$, $l = 2$ and $\alpha$ is a root of $1 + x^2 + x^3 \in \mathbf{F}_2[x]$.

(b) $p = 17$, $l = 2$ and $\alpha$ is a root of $1 + x^2 + x^3 + x^4 + x^8 \in \mathbf{F}_2[x]$.

(c) $p = 13$, $l = 3$ and $\alpha$ is a root of $2 + x^2 + x^3 \in \mathbf{F}_3[x]$.

8.26 Determine the parameters of the QR codes generated by $g_Q(x)$ ($g_N(x)$,
respectively) of Exercise 8.25.


Problems 8.27-8.31 are designed to determine the square root bound on the 
minimum distance of binary QR codes. 


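8.27 Let $p$ be a prime of the form $8m \pm 1$. Define

\[ E_Q(x) = \begin{cases}
1 + \sum_{i \in N_p} x^i & \text{if } p \text{ is of the form } 8m - 1 \\
\sum_{i \in N_p} x^i & \text{if } p \text{ is of the form } 8m + 1.
\end{cases} \]

(i) Show that $\theta$ in Definition 8.3.8 can be chosen properly so that $E_Q(x)$
is an idempotent of the binary QR code $C_Q$ of length $p$.

(ii) Put $E_Q(x) = \sum_{i=0}^{p-1} e_i x^i$ and define the $p \times p$ circulant matrix over
$\mathbf{F}_2$:

\[ G_1 = \begin{pmatrix}
e_0 & e_1 & \cdots & e_{p-1} \\
e_{p-1} & e_0 & \cdots & e_{p-2} \\
\vdots & \vdots & & \vdots \\
e_1 & e_2 & \cdots & e_0
\end{pmatrix}. \]

Show that every codeword of the binary QR code $C_Q$ of length $p$ is
a linear combination of the rows of the matrix

\[ G := \begin{pmatrix} \mathbf{1} \\ G_1 \end{pmatrix}, \]

where $\mathbf{1}$ denotes the all-one row of length $p$.

8.28 Let $u_i \in \mathbf{F}_p$ be the multiplicative inverse of $i \in \mathbf{F}_p^*$. Show that, for any
codeword $c(x) = \sum_{i=0}^{p-1} c_i x^i$ of even weight in the binary QR code $C_Q$ of
length $p$, the word $\sum_{i=1}^{p-1} c_i x^{-u_i}$ belongs to $C_Q$. (Hint: Show that $c(x)$ is
a linear combination of the rows of $G_1$, and then prove that the statement
is true if $c(x)$ is a row of $G_1$.)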

8.29 Use Exercise 8.28 to show that the minimum distance of a binary QR 
code is odd. 

8.30 Show that the minimum distance of a binary QR code of length $p$ is at
least $\sqrt{p}$.

8.31 Let $p$ be a prime of the form $4k - 1$.

(i) Show that $-1$ is a quadratic nonresidue modulo $p$.

(ii) Show that the minimum distance $d$ of the binary $[p, (p+1)/2]$-QR
codes $C_Q$, $C_N$ satisfies $d^2 - d + 1 \ge p$.

8.32 Let $g_Q(x)$ and $g_N(x)$ be the two polynomials defined in Definition 8.3.8.
The binary codes $\overline{C}_Q$ and $\overline{C}_N$ generated by $(x - 1)g_Q(x)$ and $(x - 1)g_N(x)$,
respectively, are called expurgated QR codes.

(a) Show that $\overline{C}_Q$ and $\overline{C}_N$ have dimension $(p-1)/2$ and minimum
distance at least $\sqrt{p}$.

(b) If $p$ is of the form $4k - 1$, show that $C_Q^{\perp} = \overline{C}_Q$ and $C_N^{\perp} = \overline{C}_N$.

(c) If $p$ is of the form $4k + 1$, show that $C_N^{\perp} = \overline{C}_Q$ and $C_Q^{\perp} = \overline{C}_N$.


V. D. Goppa described an interesting new class of linear error-correcting codes,
commonly called Goppa codes, in the early 1970s. This class of codes includes 
the narrow-sense BCH codes. It turned out that Goppa codes also form arguably 
the most interesting subclass of alternant codes, introduced by H. J. Helgert in 
1974. The class of alternant codes is a large and interesting family which 
contains well known codes such as the BCH codes and the Goppa codes. 


9.1 Generalized Reed-Solomon codes 

We encountered Reed-Solomon (RS) codes in Section 8.2 as a special class of
BCH codes. Recall that an RS code over $\mathbf{F}_q$ is a BCH code over $\mathbf{F}_q$ of length
$q - 1$ generated by

\[ g(x) = (x - \alpha^a)(x - \alpha^{a+1}) \cdots (x - \alpha^{a+\delta-2}), \]

with $a \ge 1$ and $q - 1 \ge \delta \ge 2$, where $\alpha$ is a primitive element of $\mathbf{F}_q$. It is an
MDS code with parameters $[q - 1, q - \delta, \delta]$ (cf. Theorem 8.2.3).

Consider the case of the narrow-sense RS codes, i.e., where $a = 1$. In this
case, there is an alternative description of the RS code that is convenient for
our purpose in this chapter.

Theorem 9.1.1 Let $\alpha$ be a primitive element of the finite field $\mathbf{F}_q$, and let
$q - 1 \ge \delta \ge 2$. The narrow-sense $q$-ary RS code with generator polynomial

\[ g(x) = (x - \alpha)(x - \alpha^2) \cdots (x - \alpha^{\delta-1}) \]

is equal to

\[ \{(f(1), f(\alpha), f(\alpha^2), \ldots, f(\alpha^{q-2})) : f(x) \in \mathbf{F}_q[x] \text{ and } \deg(f(x)) < q - \delta\}. \qquad (9.1) \]

Proof. It is easy to verify that the set in (9.1) is a vector space over $\mathbf{F}_q$. We first
show that it is contained in the RS code generated by $g(x)$.

The codeword $\mathbf{c} = (f(1), f(\alpha), f(\alpha^2), \ldots, f(\alpha^{q-2}))$ corresponds to the
polynomial $c(x) = \sum_{i=0}^{q-2} f(\alpha^i) x^i \in \mathbf{F}_q[x]/(x^n - 1)$, where $n = q - 1$. We need to show that
$g(x)$ divides $c(x)$ (cf. Lemma 8.1.16); i.e.,

\[ c(\alpha) = c(\alpha^2) = \cdots = c(\alpha^{\delta-1}) = 0. \]

Note that, for $1 \le k \le q - 2$, we have
$\sum_{i=0}^{q-2} \alpha^{ik} = ((\alpha^k)^{q-1} - 1)/(\alpha^k - 1) = 0$.

Write $f(x) = \sum_{j=0}^{q-\delta-1} f_j x^j$. Then, for $1 \le l \le \delta - 1$,

\[ c(\alpha^l) = \sum_{i=0}^{q-2} f(\alpha^i)(\alpha^l)^i
= \sum_{i=0}^{q-2} \left( \sum_{j=0}^{q-\delta-1} f_j \alpha^{ij} \right) \alpha^{il}
= \sum_{j=0}^{q-\delta-1} f_j \left( \sum_{i=0}^{q-2} \alpha^{i(j+l)} \right) = 0, \]

since $1 \le j + l \le q - 2$.

The map $f \mapsto (f(1), f(\alpha), f(\alpha^2), \ldots, f(\alpha^{q-2}))$ from the set of polynomials
in $\mathbf{F}_q[x]$ of degree $< q - \delta$ to the set in (9.1) is injective. (Any $f(x)$ in the
kernel of this map must have at least $q - 1 > q - \delta > \deg(f(x))$ zeros, but this
is only possible if $f(x)$ is identically equal to 0.) This map is clearly surjective,
hence it is an isomorphism of $\mathbf{F}_q$-vector spaces. Therefore, the dimension over
$\mathbf{F}_q$ of the vector space in (9.1) is $q - \delta$, which is the dimension of the RS code
generated by $g(x)$. Hence, the theorem follows. □
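
The first half of the proof can be checked by brute force for a small prime field. The following is a minimal Python sketch, not part of the original text; the function name `rs_evaluation_check` and the choice $q = 7$, $\alpha = 3$, $\delta = 3$ are assumptions made for illustration (3 is a primitive element modulo 7, and a prime $q$ lets us use integer arithmetic modulo $q$).

```python
from itertools import product

def rs_evaluation_check(q=7, alpha=3, delta=3):
    """Check that every evaluation vector (f(1), f(alpha), ..., f(alpha^(q-2)))
    with deg f < q - delta satisfies c(alpha^l) = 0 for l = 1, ..., delta - 1."""
    n = q - 1
    points = [pow(alpha, i, q) for i in range(n)]
    for coeffs in product(range(q), repeat=q - delta):
        # codeword: evaluations of f at the powers of alpha
        c = [sum(fj * pow(a, j, q) for j, fj in enumerate(coeffs)) % q
             for a in points]
        for l in range(1, delta):
            root = pow(alpha, l, q)
            # c(x) = sum_i c_i x^i evaluated at alpha^l
            if sum(ci * pow(root, i, q) for i, ci in enumerate(c)) % q != 0:
                return False
    return True

print(rs_evaluation_check())  # True
```

The check runs over all $7^4$ polynomials of degree less than $q - \delta = 4$, so it confirms the containment, not the dimension count, which is handled by the injectivity argument.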

The following corollary gives another explicit generator matrix for the 
narrow-sense RS code. 

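Corollary 9.1.2 Let $\alpha$ be a primitive element of $\mathbf{F}_q$, and let $q - 1 \ge \delta \ge 2$.
The matrix

\[ \begin{pmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & \alpha & \alpha^2 & \cdots & \alpha^{q-2} \\
1 & \alpha^2 & \alpha^4 & \cdots & \alpha^{2(q-2)} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \alpha^{q-\delta-1} & \alpha^{2(q-\delta-1)} & \cdots & \alpha^{(q-\delta-1)(q-2)}
\end{pmatrix} \]

is a generator matrix for the RS code generated by the polynomial
$g(x) = (x - \alpha)(x - \alpha^2) \cdots (x - \alpha^{\delta-1})$.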

An easy generalization of the description of the RS code in Theorem 9.1.1
leads to a more general class of codes which are also MDS.

Definition 9.1.3 Let $n \le q$. Let $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_n)$, where $\alpha_i$ ($1 \le i \le n$)
are distinct elements of $\mathbf{F}_q$. Let $\mathbf{v} = (v_1, v_2, \ldots, v_n)$, where $v_i \in \mathbf{F}_q^*$ for all
$1 \le i \le n$. For $k \le n$, the generalized Reed-Solomon code $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$ is
defined to be

\[ \{(v_1 f(\alpha_1), v_2 f(\alpha_2), \ldots, v_n f(\alpha_n)) : f(x) \in \mathbf{F}_q[x] \text{ and } \deg(f(x)) < k\}. \]

The elements $\alpha_1, \alpha_2, \ldots, \alpha_n$ are called the code locators of $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$.

Theorem 9.1.4 The generalized RS code $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$ has parameters
$[n, k, n - k + 1]$, so it is an MDS code.

Proof. It is obvious that $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$ has length $n$. The same argument as in
the proof of Theorem 9.1.1 also shows that its dimension is $k$. It remains to
show that its minimum distance is $n - k + 1$.

To do this, we count the maximum number of zeros in a nonzero codeword.
Suppose $f(x)$ is not identically zero. Since $\deg(f(x)) < k$, the polynomial
$f(x)$ can only have at most $k - 1$ zeros; i.e., the codeword
$(v_1 f(\alpha_1), v_2 f(\alpha_2), \ldots, v_n f(\alpha_n))$ has at most $k - 1$ zeros among its
coordinates. In other words, its weight is at least $n - k + 1$, so the minimum
distance $d$ of $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$ satisfies $d \ge n - k + 1$. However, the Singleton
bound shows that $d \le n - k + 1$, so $d = n - k + 1$. Hence, $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$ is MDS. □
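
For small parameters the MDS property can be confirmed by enumerating the code. The following is a minimal Python sketch, not part of the original text; the function names and the sample parameters over the prime field with 5 elements are assumptions made for illustration.

```python
from itertools import product

def grs_codewords(q, alphas, v, k):
    """Yield all codewords (v_1 f(a_1), ..., v_n f(a_n)) with deg f < k over F_q, q prime."""
    for coeffs in product(range(q), repeat=k):
        yield tuple(
            vi * sum(c * pow(a, j, q) for j, c in enumerate(coeffs)) % q
            for a, vi in zip(alphas, v)
        )

def min_distance(q, alphas, v, k):
    """Minimum weight over the nonzero codewords (equals the minimum distance of a linear code)."""
    return min(
        sum(x != 0 for x in cw)
        for cw in grs_codewords(q, alphas, v, k)
        if any(cw)
    )

# n = 4 distinct locators in F_5, nonzero multipliers, k = 2
print(min_distance(5, [0, 1, 2, 3], [1, 2, 3, 4], 2))  # 3 = n - k + 1
```

Exhaustive enumeration is only feasible for tiny codes; it serves here as a sanity check of the bound $d = n - k + 1$.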

Remark 9.1.5 In the case where $\mathbf{v} = (1, 1, \ldots, 1)$ and $n < q - 1$, the
generalized RS code constructed is often called a punctured RS code, as it can be
obtained by puncturing an RS code at suitable coordinates.

As for RS codes (cf. Exercise 8.19), the dual of a generalized Reed-Solomon 
code is again a generalized Reed-Solomon code. 

Theorem 9.1.6 The dual of the generalized Reed-Solomon code $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$
over $\mathbf{F}_q$ of length $n$ is $GRS_{n-k}(\boldsymbol{\alpha}, \mathbf{v}')$ for some $\mathbf{v}' \in (\mathbf{F}_q^*)^n$.

Proof. First, let $k = n - 1$. From Theorems 5.4.5 and 9.1.4, the dual of
$GRS_{n-1}(\boldsymbol{\alpha}, \mathbf{v})$ is an MDS code of dimension 1, so it has parameters $[n, 1, n]$.
In particular, its basis consists of a vector $\mathbf{v}' = (v'_1, \ldots, v'_n)$, where $v'_i \in \mathbf{F}_q^*$ for
all $1 \le i \le n$. Clearly, this dual code is $GRS_1(\boldsymbol{\alpha}, \mathbf{v}')$.

It follows, in particular, that, for all $f(x) \in \mathbf{F}_q[x]$ of degree $< n - 1$, we
have

\[ v_1 v'_1 f(\alpha_1) + \cdots + v_n v'_n f(\alpha_n) = 0, \qquad (9.2) \]

where $\mathbf{v} = (v_1, \ldots, v_n)$.

Now, for arbitrary $k$, we claim that $GRS_k(\boldsymbol{\alpha}, \mathbf{v})^{\perp} = GRS_{n-k}(\boldsymbol{\alpha}, \mathbf{v}')$.

A typical codeword in $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$ is $(v_1 f(\alpha_1), \ldots, v_n f(\alpha_n))$, where
$f(x) \in \mathbf{F}_q[x]$ has degree $\le k - 1$, while a typical codeword in $GRS_{n-k}(\boldsymbol{\alpha}, \mathbf{v}')$ has the
form $(v'_1 g(\alpha_1), \ldots, v'_n g(\alpha_n))$, with $g(x) \in \mathbf{F}_q[x]$ of degree $\le n - k - 1$. Since
$\deg(f(x)g(x)) \le n - 2 < n - 1$, we have

\[ (v_1 f(\alpha_1), \ldots, v_n f(\alpha_n)) \cdot (v'_1 g(\alpha_1), \ldots, v'_n g(\alpha_n)) = 0 \]

from (9.2).

Therefore, $GRS_{n-k}(\boldsymbol{\alpha}, \mathbf{v}') \subseteq GRS_k(\boldsymbol{\alpha}, \mathbf{v})^{\perp}$. Comparing the dimensions of
both codes, the theorem follows. □


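Corollary 9.1.7 A parity-check matrix of $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$ is

\[ \begin{pmatrix}
v'_1 & v'_2 & \cdots & v'_n \\
v'_1\alpha_1 & v'_2\alpha_2 & \cdots & v'_n\alpha_n \\
\vdots & \vdots & & \vdots \\
v'_1\alpha_1^{n-k-1} & v'_2\alpha_2^{n-k-1} & \cdots & v'_n\alpha_n^{n-k-1}
\end{pmatrix}
=
\begin{pmatrix}
1 & 1 & \cdots & 1 \\
\alpha_1 & \alpha_2 & \cdots & \alpha_n \\
\alpha_1^2 & \alpha_2^2 & \cdots & \alpha_n^2 \\
\vdots & \vdots & & \vdots \\
\alpha_1^{n-k-1} & \alpha_2^{n-k-1} & \cdots & \alpha_n^{n-k-1}
\end{pmatrix}
\begin{pmatrix}
v'_1 & 0 & \cdots & 0 \\
0 & v'_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & v'_n
\end{pmatrix}. \]

Remark 9.1.8 Recall that $\mathbf{v}' = (v'_1, \ldots, v'_n)$ is any vector that generates the
dual of $GRS_{n-1}(\boldsymbol{\alpha}, \mathbf{v})$, so it is not unique (cf. Exercise 9.1). In particular, the
parity-check matrix in Corollary 9.1.7 is also not unique.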
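
One concrete choice of $\mathbf{v}'$ can be written down explicitly: by Lagrange interpolation, $v'_i = \bigl(v_i \prod_{j \ne i} (\alpha_i - \alpha_j)\bigr)^{-1}$ satisfies (9.2). This formula is a standard fact and is not stated in the text above at this point; the following minimal Python sketch (function names are assumptions, arithmetic is over a prime field) checks it on the monomials $1, x, \ldots, x^{n-2}$, which is sufficient by linearity.

```python
def dual_multipliers(q, alphas, v):
    """Return v'_i = (v_i * prod_{j != i} (a_i - a_j))^(-1) in F_q, q prime."""
    out = []
    for i, ai in enumerate(alphas):
        prod = v[i] % q
        for j, aj in enumerate(alphas):
            if j != i:
                prod = prod * (ai - aj) % q
        out.append(pow(prod, q - 2, q))  # inverse via Fermat's little theorem
    return out

def satisfies_9_2(q, alphas, v):
    """Check sum_i v_i v'_i f(a_i) = 0 for f = x^d, d = 0, ..., n - 2."""
    vp = dual_multipliers(q, alphas, v)
    n = len(alphas)
    return all(
        sum(vi * wi * pow(a, d, q) for vi, wi, a in zip(v, vp, alphas)) % q == 0
        for d in range(n - 1)
    )

print(satisfies_9_2(7, [0, 1, 2, 3, 4], [1, 2, 3, 4, 5]))  # True
```

Any nonzero scalar multiple of this vector works equally well, in line with Remark 9.1.8.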


9.2 Alternant codes 

An interesting family of codes arising from the generalized RS codes of the 
previous section is the class of alternant codes. This is quite a large family that 
includes the Hamming codes and the BCH codes. 

We use the same notation as in the previous section, except that the
generalized RS codes are now defined over $\mathbf{F}_{q^m}$, for some $m \ge 1$.

Definition 9.2.1 An alternant code $A_k(\boldsymbol{\alpha}, \mathbf{v}')$ over the finite field $\mathbf{F}_q$ is the
subfield subcode $GRS_k(\boldsymbol{\alpha}, \mathbf{v})|_{\mathbf{F}_q}$, where $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$ is a generalized RS code
over $\mathbf{F}_{q^m}$, for some $m \ge 1$.

Remark 9.2.3 below explains why we have chosen $\mathbf{v}'$ in the notation for the
alternant code instead of $\mathbf{v}$.

Proposition 9.2.2 The alternant code $A_k(\boldsymbol{\alpha}, \mathbf{v}')$ has parameters $[n, k', d]$,
where $mk - (m - 1)n \le k' \le k$ and $d \ge n - k + 1$.

Proof. By Theorem 9.1.4, $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$ has parameters $[n, k, n - k + 1]$. Hence,
$A_k(\boldsymbol{\alpha}, \mathbf{v}')$ clearly has length $n$, and its dimension $k'$ trivially satisfies $k' \le k$.
The result follows from Theorem 6.3.5. □

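Remark 9.2.3 It follows directly from Definition 9.2.1 and Corollary 9.1.7 that
$A_k(\boldsymbol{\alpha}, \mathbf{v}')$ is none other than

\[ \{\mathbf{c} \in \mathbf{F}_q^n : \mathbf{c}H^{\mathrm{T}} = \mathbf{0}\}, \]

where $H$ is the matrix in Corollary 9.1.7. Since $H$ is determined by $\boldsymbol{\alpha}$ and $\mathbf{v}'$,
it is appropriate for the notation for the alternant code to be expressed in terms
of $\boldsymbol{\alpha}$ and $\mathbf{v}'$.

Recall that every element $\beta \in \mathbf{F}_{q^m}$ can be written uniquely in the form
$\sum_{i=0}^{m-1} \beta_i \alpha^i$, where $\alpha$ is a primitive element of $\mathbf{F}_{q^m}$ and $\beta_i \in \mathbf{F}_q$, for all
$0 \le i \le m - 1$. Therefore, if we replace every entry $\beta$ of $H$ by the column vector
$(\beta_0, \ldots, \beta_{m-1})^{\mathrm{T}}$, we obtain an $(n - k)m \times n$ matrix $\overline{H}$ with entries in $\mathbf{F}_q$ such
that $A_k(\boldsymbol{\alpha}, \mathbf{v}')$ is

\[ \{\mathbf{c} \in \mathbf{F}_q^n : \mathbf{c}\overline{H}^{\mathrm{T}} = \mathbf{0}\}. \]

This matrix $\overline{H}$ plays the role of a parity-check matrix of $A_k(\boldsymbol{\alpha}, \mathbf{v}')$, except that
its rows are not necessarily linearly independent, so we refrain from calling it
a parity-check matrix of $A_k(\boldsymbol{\alpha}, \mathbf{v}')$. (However, this appellation is used in some
books.)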

We now look at some examples of alternant codes. 

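Example 9.2.4 (i) Let $q = 2$ and let $m$ be any integer $\ge 3$. Let $\alpha$ be a primitive
element of $\mathbf{F}_{2^m}$. Set

\[ \mathbf{v}' = (1, \alpha, \alpha^2, \ldots, \alpha^{2^m - 2}). \]

For any $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_{2^m - 1})$, where $\{\alpha_1, \ldots, \alpha_{2^m - 1}\} = \mathbf{F}_{2^m}^*$, the alternant code
$A_{2^m - 2}(\boldsymbol{\alpha}, \mathbf{v}')$ is

\[ A_{2^m - 2}(\boldsymbol{\alpha}, \mathbf{v}') = \{\mathbf{c} \in \mathbf{F}_2^{2^m - 1} : \mathbf{c}(1, \alpha, \alpha^2, \ldots, \alpha^{2^m - 2})^{\mathrm{T}} = 0\}. \]

It is clear that, for $H = (1, \alpha, \alpha^2, \ldots, \alpha^{2^m - 2})$, $\overline{H}$ is an $m \times (2^m - 1)$ matrix
whose columns are all the nonzero vectors in $\mathbf{F}_2^m$. Recall that this is a
parity-check matrix for the binary Hamming code $\mathrm{Ham}(m, 2)$, so
$A_{2^m - 2}(\boldsymbol{\alpha}, \mathbf{v}') = \mathrm{Ham}(m, 2)$.

(ii) For any $q$ and $m$, recall from (8.1) that a BCH code over $\mathbf{F}_q$ is a code
consisting of all $\mathbf{c} \in \mathbf{F}_q^n$ that satisfy $\mathbf{c}H^{\mathrm{T}} = \mathbf{0}$, where $H$ is the matrix
displayed in (8.1), which is exactly in the form of Corollary 9.1.7. Therefore, a BCH
code is also an alternant code.

(iii) Let $q = 2$ and $m = 3$, and set $n = 6$. Let $\alpha$ be a primitive element of $\mathbf{F}_8$
that satisfies $\alpha^3 + \alpha + 1 = 0$. Take $\mathbf{v}' = (1, \ldots, 1)$ and $\boldsymbol{\alpha} = (\alpha, \alpha^2, \ldots, \alpha^6)$.
Then $A_3(\boldsymbol{\alpha}, \mathbf{v}') = \{\mathbf{c} \in \mathbf{F}_2^6 : \mathbf{c}H^{\mathrm{T}} = \mathbf{0}\}$, where

\[ H = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
\alpha & \alpha^2 & \alpha^3 & \alpha^4 & \alpha^5 & \alpha^6 \\
\alpha^2 & \alpha^4 & \alpha^6 & \alpha^8 & \alpha^{10} & \alpha^{12}
\end{pmatrix}. \]

Then, writing each entry with respect to the basis $1, \alpha, \alpha^2$,

\[ \overline{H} = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 1 \\
1 & 0 & 1 & 1 & 1 & 0 \\
0 & 1 & 0 & 1 & 1 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 \\
0 & 1 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 1
\end{pmatrix}, \]

which has the following reduced row echelon form:

\[ \begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}. \]

Hence, it follows that $A_3(\boldsymbol{\alpha}, \mathbf{v}')$ has a generator matrix

\[ \begin{pmatrix}
1 & 0 & 1 & 1 & 1 & 0 \\
1 & 1 & 1 & 0 & 0 & 1
\end{pmatrix}, \]

so it is a [6, 2, 4]-code.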
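
The computation in part (iii) can be reproduced by brute force. The following is a minimal Python sketch, not part of the original text; it assumes elements of $\mathbf{F}_8$ are stored as 3-bit integers in which bit $i$ is the coefficient of $\alpha^i$, and the function names are hypothetical.

```python
from itertools import product

def gf8_mul(a, b):
    """Multiply in F_8 = F_2[alpha]/(alpha^3 + alpha + 1)."""
    r = 0
    for _ in range(3):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:      # reduce by alpha^3 = alpha + 1
            a ^= 0b1011
    return r

def gf8_pow(a, e):
    r = 1
    for _ in range(e):
        r = gf8_mul(r, a)
    return r

def alternant_code(locators, v_prime, rows):
    """All binary c with c H^T = 0, where H has rows (v'_i * a_i^j), j = 0, ..., rows - 1."""
    H = [[gf8_mul(v, gf8_pow(a, j)) for a, v in zip(locators, v_prime)]
         for j in range(rows)]
    code = []
    for c in product((0, 1), repeat=len(locators)):
        syndromes = []
        for row in H:
            s = 0
            for ci, h in zip(c, row):
                if ci:
                    s ^= h      # addition in F_8 is bitwise XOR
            syndromes.append(s)
        if not any(syndromes):
            code.append(c)
    return code

alpha = 0b010
locators = [gf8_pow(alpha, i) for i in range(1, 7)]   # alpha, ..., alpha^6
code = alternant_code(locators, [1] * 6, rows=3)      # n - k = 3
print(len(code), min(sum(c) for c in code if any(c)))  # 4 4
```

The four codewords found are the span of the two rows of the generator matrix above, confirming the parameters [6, 2, 4].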

The following description of the dual of an alternant code is an immediate 
consequence of Theorems 6.3.9 and 9.1.6. 


Theorem 9.2.5 The dual of the alternant code $A_k(\boldsymbol{\alpha}, \mathbf{v}')$ is

\[ \mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}(GRS_{n-k}(\boldsymbol{\alpha}, \mathbf{v}')). \]


The following theorem shows the existence of an alternant code with certain 
parameters. 

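Theorem 9.2.6 Given any positive integers $n, h, \delta, m$, there exists an alternant
code $A_k(\boldsymbol{\alpha}, \mathbf{v}')$ over $\mathbf{F}_q$, which is the subfield subcode of a generalized RS code
over $\mathbf{F}_{q^m}$, with parameters $[n, k', d]$, where $k' \ge h$ and $d \ge \delta$, so long as

\[ \sum_{w=1}^{\delta-1} (q - 1)^w \binom{n}{w} < (q^m - 1)^{\lfloor (n-h)/m \rfloor}. \qquad (9.3) \]

Proof. For any vector $\mathbf{c} \in \mathbf{F}_q^n$, let

\[ R(\boldsymbol{\alpha}, k, \mathbf{c}) = \{\mathbf{v} \in (\mathbf{F}_{q^m}^*)^n : \mathbf{c} \in GRS_k(\boldsymbol{\alpha}, \mathbf{v})\}. \]

Writing $\mathbf{c} = (c_1, \ldots, c_n)$ and $\mathbf{v} = (v_1, \ldots, v_n)$, we have that $c_i = v_i f(\alpha_i)$,
where $f(x) \in \mathbf{F}_{q^m}[x]$ has degree $< k$, for all $1 \le i \le n$. For a fixed $\mathbf{c}$, $f(x)$ is
fixed once $k$ values of $v_i$ are chosen. Therefore,

\[ |R(\boldsymbol{\alpha}, k, \mathbf{c})| \le (q^m - 1)^k. \]

The number of nonzero vectors $\mathbf{c} \in \mathbf{F}_q^n$ of weight $< \delta$ is given by
$\sum_{w=1}^{\delta-1} (q - 1)^w \binom{n}{w}$, so, taking $k = n - \lfloor (n - h)/m \rfloor$, we have

\[ \Bigl| \bigcup_{\substack{\mathbf{c} \in \mathbf{F}_q^n,\ \mathbf{c} \ne \mathbf{0} \\ \mathrm{wt}(\mathbf{c}) < \delta}} R(\boldsymbol{\alpha}, k, \mathbf{c}) \Bigr|
\le (q^m - 1)^{n - \lfloor (n-h)/m \rfloor} \sum_{w=1}^{\delta-1} (q - 1)^w \binom{n}{w}. \]

Now,

\[ |(\mathbf{F}_{q^m}^*)^n| = (q^m - 1)^n. \]

Therefore, if (9.3) is satisfied, then the union above
is strictly smaller than $(\mathbf{F}_{q^m}^*)^n$; i.e., there exists $\mathbf{v} \in (\mathbf{F}_{q^m}^*)^n$ such that $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$
does not contain any nonzero vector of $\mathbf{F}_q^n$ of weight $< \delta$. Hence, the alternant code
$A_k(\boldsymbol{\alpha}, \mathbf{v}')$ has distance $\ge \delta$. Its length is clearly $n$. Since
$k = n - \lfloor (n - h)/m \rfloor \ge ((m - 1)n + h)/m$, Proposition 9.2.2 implies that the dimension $k'$ satisfies
$k' \ge mk - (m - 1)n \ge h$. □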
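
Condition (9.3) is easy to evaluate numerically. The following is a minimal Python sketch, not part of the original text; it assumes that (9.3) reads $\sum_{w=1}^{\delta-1} (q-1)^w \binom{n}{w} < (q^m - 1)^{\lfloor (n-h)/m \rfloor}$, and the function name `condition_9_3` is hypothetical.

```python
from math import comb

def condition_9_3(q, m, n, h, delta):
    """Return True if the sufficient condition (9.3) holds for the given parameters."""
    lhs = sum((q - 1) ** w * comb(n, w) for w in range(1, delta))
    rhs = (q ** m - 1) ** ((n - h) // m)
    return lhs < rhs

# Binary alternant codes of length 15 obtained from F_16 (q = 2, m = 4)
print(condition_9_3(2, 4, 15, 3, 3))  # True:  120 < 15^3 = 3375
print(condition_9_3(2, 4, 15, 3, 7))  # False: 9948 is not below 3375
```

The condition is only sufficient: a `False` result means the counting argument of the proof gives no guarantee, not that no such code exists.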




9.3 Goppa codes 

One of the most interesting subclasses of alternant codes is the family of Goppa 
codes, introduced by V. D. Goppa [3, 4] in the early 1970s. This family also 
contains long codes that have good parameters. Goppa codes are used also in 
cryptography - the McEliece cryptosystem and the Niederreiter cryptosystem 
are examples of public-key cryptosystems that use Goppa codes. 

Definition 9.3.1 Let $g(z)$ be a polynomial in $\mathbf{F}_{q^m}[z]$ for some fixed $m$, and let
$L = \{\alpha_1, \ldots, \alpha_n\}$ be a subset of $\mathbf{F}_{q^m}$ such that $L \cap \{\text{zeros of } g(z)\} = \emptyset$. For
$\mathbf{c} = (c_1, \ldots, c_n) \in \mathbf{F}_q^n$, let

\[ R_{\mathbf{c}}(z) = \sum_{i=1}^{n} \frac{c_i}{z - \alpha_i}. \]

The Goppa code $\Gamma(L, g)$ is defined as

\[ \Gamma(L, g) = \{\mathbf{c} \in \mathbf{F}_q^n : R_{\mathbf{c}}(z) \equiv 0 \pmod{g(z)}\}. \]

The polynomial $g(z)$ is called the Goppa polynomial. When $g(z)$ is irreducible,
$\Gamma(L, g)$ is called an irreducible Goppa code.

Remark 9.3.2 (i) Notice that

\[ (z - \alpha_i) \left( -\frac{g(z) - g(\alpha_i)}{z - \alpha_i}\, g(\alpha_i)^{-1} \right) \equiv 1 \pmod{g(z)}, \]

and, since $(g(z) - g(\alpha_i))/(z - \alpha_i)$ is a polynomial, it follows that, modulo $g(z)$,
$1/(z - \alpha_i)$ may be regarded as a polynomial; i.e.,

\[ \frac{1}{z - \alpha_i} \equiv -\frac{g(z) - g(\alpha_i)}{z - \alpha_i}\, g(\alpha_i)^{-1} \pmod{g(z)}. \]

Hence, the congruence $R_{\mathbf{c}}(z) \equiv 0 \pmod{g(z)}$ in the definition of $\Gamma(L, g)$ means
that $g(z)$ divides the polynomial

\[ \sum_{i=1}^{n} c_i\, \frac{g(z) - g(\alpha_i)}{z - \alpha_i}\, g(\alpha_i)^{-1}. \]

However, noting that $(g(z) - g(\alpha_i))/(z - \alpha_i)$ is a polynomial of degree $< t$ if
$g(z)$ has degree $t$, it follows that $\mathbf{c} \in \Gamma(L, g)$ if and only if

\[ \sum_{i=1}^{n} c_i\, \frac{g(z) - g(\alpha_i)}{z - \alpha_i}\, g(\alpha_i)^{-1} = 0 \qquad (9.4) \]

as a polynomial.

(ii) It is clear from the definition that Goppa codes are linear.

The next proposition shows immediately that Goppa codes are examples of 
alternant codes. 



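Proposition 9.3.3 For a given Goppa polynomial $g(z)$ of degree $t$ and
$L = \{\alpha_1, \ldots, \alpha_n\}$, we have $\Gamma(L, g) = \{\mathbf{c} \in \mathbf{F}_q^n : \mathbf{c}H^{\mathrm{T}} = \mathbf{0}\}$, where

\[ H = \begin{pmatrix}
g(\alpha_1)^{-1} & \cdots & g(\alpha_n)^{-1} \\
\alpha_1 g(\alpha_1)^{-1} & \cdots & \alpha_n g(\alpha_n)^{-1} \\
\vdots & & \vdots \\
\alpha_1^{t-1} g(\alpha_1)^{-1} & \cdots & \alpha_n^{t-1} g(\alpha_n)^{-1}
\end{pmatrix}. \]

Proof. Recall that $\mathbf{c} \in \Gamma(L, g)$ if and only if (9.4) holds.

Substituting $g(z) = \sum_{i=0}^{t} g_i z^i$ into (9.4), and equating the coefficients of
the various powers of $z$ to 0, it follows that $\mathbf{c} \in \Gamma(L, g)$ if and only if $\mathbf{c}H'^{\mathrm{T}} = \mathbf{0}$,
where

\[ H' = \begin{pmatrix}
g_t\, g(\alpha_1)^{-1} & \cdots & g_t\, g(\alpha_n)^{-1} \\
(g_{t-1} + \alpha_1 g_t)\, g(\alpha_1)^{-1} & \cdots & (g_{t-1} + \alpha_n g_t)\, g(\alpha_n)^{-1} \\
\vdots & & \vdots \\
(g_1 + \alpha_1 g_2 + \cdots + \alpha_1^{t-1} g_t)\, g(\alpha_1)^{-1} & \cdots & (g_1 + \alpha_n g_2 + \cdots + \alpha_n^{t-1} g_t)\, g(\alpha_n)^{-1}
\end{pmatrix}. \]

We see easily that $H'$ can also be decomposed as

\[ H' = \begin{pmatrix}
g_t & 0 & \cdots & 0 \\
g_{t-1} & g_t & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
g_1 & g_2 & \cdots & g_t
\end{pmatrix}
\begin{pmatrix}
1 & \cdots & 1 \\
\alpha_1 & \cdots & \alpha_n \\
\vdots & & \vdots \\
\alpha_1^{t-1} & \cdots & \alpha_n^{t-1}
\end{pmatrix}
\begin{pmatrix}
g(\alpha_1)^{-1} & & \\
 & \ddots & \\
 & & g(\alpha_n)^{-1}
\end{pmatrix}. \]

Since $g_t \ne 0$, the first factor is invertible. It now follows from Exercise 4.38 that
$\mathbf{c} \in \Gamma(L, g)$ if and only if $\mathbf{c}H^{\mathrm{T}} = \mathbf{0}$, where

\[ H = \begin{pmatrix}
1 & \cdots & 1 \\
\alpha_1 & \cdots & \alpha_n \\
\vdots & & \vdots \\
\alpha_1^{t-1} & \cdots & \alpha_n^{t-1}
\end{pmatrix}
\begin{pmatrix}
g(\alpha_1)^{-1} & & \\
 & \ddots & \\
 & & g(\alpha_n)^{-1}
\end{pmatrix}. \]

□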

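Corollary 9.3.4 For a given Goppa polynomial $g(z)$ of degree $t$ and
$L = \{\alpha_1, \ldots, \alpha_n\}$, the Goppa code $\Gamma(L, g)$ is the alternant code $A_{n-t}(\boldsymbol{\alpha}, \mathbf{v}')$, where
$\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_n)$ and $\mathbf{v}' = (g(\alpha_1)^{-1}, \ldots, g(\alpha_n)^{-1})$.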

We can also obtain directly a description of the Goppa code as a subfield 
subcode of a generalized RS code. 

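Theorem 9.3.5 With notation as above, the Goppa code $\Gamma(L, g)$ is
$GRS_{n-t}(\boldsymbol{\alpha}, \mathbf{v})|_{\mathbf{F}_q}$, where $\mathbf{v} = (v_1, \ldots, v_n)$ with
$v_i = g(\alpha_i)/\prod_{j \ne i} (\alpha_i - \alpha_j)$, for all $1 \le i \le n$.

Proof. From Proposition 9.3.3, it is clear that $\Gamma(L, g) = GRS_t(\boldsymbol{\alpha}, \mathbf{v}')^{\perp}|_{\mathbf{F}_q}$,
where $\mathbf{v}' = (g(\alpha_1)^{-1}, \ldots, g(\alpha_n)^{-1})$. Hence, it is enough to show that
$GRS_t(\boldsymbol{\alpha}, \mathbf{v}')^{\perp} = GRS_{n-t}(\boldsymbol{\alpha}, \mathbf{v})$ (cf. Theorem 9.1.6); i.e.,

\[ v_1 g(\alpha_1)^{-1} f(\alpha_1) + \cdots + v_n g(\alpha_n)^{-1} f(\alpha_n) = 0, \]

where $\mathbf{v} = (v_1, \ldots, v_n)$ with $v_i = g(\alpha_i)/\prod_{j \ne i} (\alpha_i - \alpha_j)$, for all $1 \le i \le n$,
and for all polynomials $f(x) \in \mathbf{F}_{q^m}[x]$ of degree $\le n - 2$.

Since $f(x)$ is a polynomial of degree $\le n - 2$, it is determined by its values
at $\ge n - 1$ points, so it follows that (cf. Exercise 3.26)

\[ f(z) = \sum_{i=1}^{n} f(\alpha_i) \prod_{j \ne i} \frac{z - \alpha_j}{\alpha_i - \alpha_j}. \]

Equating the coefficients of $z^{n-1}$, we obtain (since $\deg(f(x)) \le n - 2$)

\[ 0 = \sum_{i=1}^{n} \frac{f(\alpha_i)}{\prod_{j \ne i} (\alpha_i - \alpha_j)} = v_1 g(\alpha_1)^{-1} f(\alpha_1) + \cdots + v_n g(\alpha_n)^{-1} f(\alpha_n). \]

□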

By Proposition 9.2.2, Corollary 9.3.4 (or, equivalently, Theorem 9.3.5) also 
gives immediately a bound for both the dimension and the minimum distance 
of a Goppa code. 

Corollary 9.3.6 For a given Goppa polynomial $g(z)$ of degree $t$ and
$L = \{\alpha_1, \ldots, \alpha_n\}$, the Goppa code $\Gamma(L, g)$ is a linear code over $\mathbf{F}_q$ with parameters
$[n, k, d]$, where $k \ge n - mt$ and $d \ge t + 1$.

The following description of the dual of a Goppa code now follows imme- 
diately from Theorem 9.2.5. 

Corollary 9.3.7 With notation as above, the dual of the Goppa code $\Gamma(L, g)$
is the trace code $\mathrm{Tr}_{\mathbf{F}_{q^m}/\mathbf{F}_q}(GRS_t(\boldsymbol{\alpha}, \mathbf{v}'))$, where $\mathbf{v}' = (g(\alpha_1)^{-1}, \ldots, g(\alpha_n)^{-1})$.

When q = 2, i.e., in the binary case, a sharpening of the lower bound on d 
can be obtained. 

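For a given polynomial $g(z)$, we write $\bar{g}(z)$ for the lowest degree perfect
square polynomial that is divisible by $g(z)$. Denote by $\bar{t}$ the degree of $\bar{g}(z)$.

For a vector $\mathbf{c} = (c_1, \ldots, c_n) \in \mathbf{F}_q^n$ of weight $w$, with $c_{i_1} = \cdots = c_{i_w} = 1$,
say, let

\[ f_{\mathbf{c}}(z) = \prod_{j=1}^{w} (z - \alpha_{i_j}). \]

Taking its derivative yields

\[ f'_{\mathbf{c}}(z) = \sum_{l=1}^{w} \prod_{j \ne l} (z - \alpha_{i_j}). \]

Hence, we have

\[ R_{\mathbf{c}}(z) = \frac{f'_{\mathbf{c}}(z)}{f_{\mathbf{c}}(z)}. \qquad (9.5) \]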

Proposition 9.3.8 Let $q = 2$. With notation as above, $\mathbf{c} \in \mathbf{F}_2^n$ belongs to
$\Gamma(L, g)$ if and only if $\bar{g}(z)$ divides $f'_{\mathbf{c}}(z)$. Consequently, the minimum distance
$d$ of $\Gamma(L, g)$ satisfies $d \ge \bar{t} + 1$. In particular, if $g(z)$ has no multiple root (i.e.,
$g(z)$ is a separable polynomial), then $d \ge 2t + 1$.

Proof. By definition, $\mathbf{c} \in \Gamma(L, g)$ if and only if $R_{\mathbf{c}}(z) \equiv 0 \pmod{g(z)}$. From
(9.5), and noting that $f_{\mathbf{c}}(z)$ and $g(z)$ have no common factors, it follows that
$\mathbf{c} \in \Gamma(L, g)$ if and only if $g(z)$ divides $f'_{\mathbf{c}}(z)$. However, as we are working in
characteristic 2, $f'_{\mathbf{c}}(z)$, being the derivative of a polynomial, contains only even
powers of $z$ and is hence a perfect square polynomial. Therefore, $g(z)$ divides
$f'_{\mathbf{c}}(z)$ if and only if $\bar{g}(z)$ divides $f'_{\mathbf{c}}(z)$. This proves the first statement of the
proposition.

If $\mathbf{c}$ is a codeword of minimum weight $d$ in $\Gamma(L, g)$, then $f_{\mathbf{c}}(z)$ has degree
$d$, so $f'_{\mathbf{c}}(z)$ has degree $\le d - 1$. The condition that $\bar{g}(z)$ divides $f'_{\mathbf{c}}(z)$ implies
that $d - 1 \ge \deg(f'_{\mathbf{c}}(z)) \ge \deg(\bar{g}(z)) = \bar{t}$.

If $g(z)$ has no multiple root, then clearly $\bar{g}(z) = (g(z))^2$, so $\bar{t} = 2t$. □

Remark 9.3.9 (i) When g(z) is separable, the Goppa code T(L, g) is said to 
be separable. 

(ii) If it is known that the minimum distance $d$ of $\Gamma(L, g)$ is even, then the
bounds above can be slightly improved to $d \ge \bar{t} + 2$ and $d \ge 2t + 2$.


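Example 9.3.10 (i) Let $q = 2$, let $g(z) = z$ and set $L = \mathbf{F}_{2^m}^*$. The Goppa code
$\Gamma(L, g)$ is then $\{\mathbf{c} \in \mathbf{F}_2^{2^m - 1} : \mathbf{c}H^{\mathrm{T}} = \mathbf{0}\}$, where

\[ H = (1, \alpha, \alpha^2, \ldots, \alpha^{2^m - 2}), \]

with $\alpha$ a primitive element of $\mathbf{F}_{2^m}$. As we have seen in Example 9.2.4(i), this is
none other than the binary Hamming code $\mathrm{Ham}(m, 2)$.

(ii) For any $q$, take $g(z) = z^t$ and let $L = \{1, \alpha^{-1}, \alpha^{-2}, \ldots, \alpha^{-(n-1)}\}$,
where $\alpha$ is a primitive element of $\mathbf{F}_{q^m}$. (Hence, $n = q^m - 1$.) Then
$\Gamma(L, g) = \{\mathbf{c} \in \mathbf{F}_q^n : \mathbf{c}H^{\mathrm{T}} = \mathbf{0}\}$, where

\[ H = \begin{pmatrix}
1 & \alpha^t & \alpha^{2t} & \cdots & \alpha^{(n-1)t} \\
1 & \alpha^{t-1} & \alpha^{2(t-1)} & \cdots & \alpha^{(n-1)(t-1)} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \alpha & \alpha^2 & \cdots & \alpha^{n-1}
\end{pmatrix}. \]

Comparing with (8.1), we see that $\Gamma(L, g)$ is precisely a narrow-sense BCH
code.

(iii) Let $q = 2$ and take $g(z) = \alpha^3 + z + z^2$, where $\alpha$ is a primitive element
of $\mathbf{F}_8$ that satisfies $\alpha^3 + \alpha + 1 = 0$. Let $L = \mathbf{F}_8$, so $n = 8$ and $m = 3$. Then
$\Gamma(L, g) = \{\mathbf{c} \in \mathbf{F}_2^8 : \mathbf{c}H^{\mathrm{T}} = \mathbf{0}\}$, where, with the elements of $L$ ordered as
$0, 1, \alpha, \alpha^2, \ldots, \alpha^6$,

\[ H = \begin{pmatrix}
\alpha^4 & \alpha^4 & \alpha & 1 & \alpha & \alpha^2 & \alpha^2 & 1 \\
0 & \alpha^4 & \alpha^2 & \alpha^2 & \alpha^4 & \alpha^6 & 1 & \alpha^6
\end{pmatrix}. \]

Replacing each entry in $H$ by a column vector in $\mathbf{F}_2^3$, we obtain a matrix $\overline{H}$
which has the following reduced row echelon form:

\[ \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 1
\end{pmatrix}, \]

so $\Gamma(L, g)$ has a generator matrix

\[ \begin{pmatrix}
1 & 1 & 1 & 0 & 1 & 1 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 & 0 & 1
\end{pmatrix}. \]

Therefore, $\Gamma(L, g)$ has parameters [8, 2, 5].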
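
The Goppa code of part (iii) can be recomputed directly from the parity-check matrix of Proposition 9.3.3. The following is a minimal Python sketch, not part of the original text; it assumes elements of $\mathbf{F}_8$ are stored as 3-bit integers with bit $i$ the coefficient of $\alpha^i$, that $L$ is ordered as $0, 1, \alpha, \ldots, \alpha^6$, and all function names are hypothetical.

```python
from itertools import product

def mul(a, b):
    """Multiply in F_8 = F_2[alpha]/(alpha^3 + alpha + 1)."""
    r = 0
    for _ in range(3):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= 0b1011
    return r

def inv(a):
    """Inverse of a nonzero element, using a^7 = 1, so a^(-1) = a^6."""
    r = 1
    for _ in range(6):
        r = mul(r, a)
    return r

def evaluate(g, z):
    """Horner evaluation of g = [g_0, g_1, ..., g_t] at z."""
    r = 0
    for coeff in reversed(g):
        r = mul(r, z) ^ coeff
    return r

def goppa_code(L, g):
    """All binary c with c H^T = 0, where H has entries a_i^j * g(a_i)^(-1), 0 <= j < t."""
    t = len(g) - 1
    H = []
    for j in range(t):
        row = []
        for a in L:
            power = 1
            for _ in range(j):
                power = mul(power, a)
            row.append(mul(power, inv(evaluate(g, a))))
        H.append(row)
    code = []
    for c in product((0, 1), repeat=len(L)):
        ok = True
        for row in H:
            s = 0
            for ci, h in zip(c, row):
                if ci:
                    s ^= h
            if s:
                ok = False
                break
        if ok:
            code.append(c)
    return code

L = [0, 1, 2, 4, 3, 6, 7, 5]   # 0, 1, alpha, alpha^2, ..., alpha^6
g = [3, 1, 1]                  # alpha^3 + z + z^2, with alpha^3 = 1 + alpha = 0b011
code = goppa_code(L, g)
print(len(code), min(sum(c) for c in code if any(c)))  # 4 5
```

The output agrees with the parameters [8, 2, 5], and the minimum weight 5 equals $2t + 1$ for $t = 2$.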

In both Examples 9.3.10(i) and (iii), the bound $d \ge 2t + 1$ in Proposition
9.3.8 is attained. The following theorem shows the existence of a Goppa code
of certain parameters.

Theorem 9.3.11 There is a $q$-ary Goppa code $\Gamma(L, g)$, where $g(z)$ is an
irreducible polynomial in $\mathbf{F}_{q^m}[z]$ of degree $t$ and $L = \mathbf{F}_{q^m}$, of parameters $[q^m, k, d]$,
where $k \ge q^m - mt$, provided

\[ \sum_{w=t+1}^{d-1} \left\lfloor \frac{w-1}{t} \right\rfloor (q - 1)^w \binom{q^m}{w} < \frac{1}{t}\, q^{mt} \left( 1 - (t - 1)\, q^{-mt/2} \right). \qquad (9.6) \]

Proof. Write $n = q^m$. Let $\mathbf{c} = (c_1, \ldots, c_n) \in \mathbf{F}_q^n$ be of weight $w$, with
$c_{i_1} \ne 0, \ldots, c_{i_w} \ne 0$. Then $\mathbf{c} \in \Gamma(L, g)$ if and only if $R_{\mathbf{c}}(z) \equiv 0 \pmod{g(z)}$. Since
$\mathbf{c}$ has weight $w$, $R_{\mathbf{c}}(z) = h_{\mathbf{c}}(z)/\prod_{j=1}^{w} (z - \alpha_{i_j})$, where $h_{\mathbf{c}}(z)$ has degree $\le w - 1$
and $\prod_{j=1}^{w} (z - \alpha_{i_j})$ has no common factor with $g(z)$. Therefore, $\mathbf{c} \in \Gamma(L, g)$ if
and only if $g(z)$ divides $h_{\mathbf{c}}(z)$. The number of irreducible polynomials $g(z)$ of
degree $t$ that can divide a given $h_{\mathbf{c}}(z)$ is at most $\lfloor (w - 1)/t \rfloor$, so the number of
$\Gamma(L, g)$ containing a given $\mathbf{c}$ of weight $w$, with $g(z)$ irreducible of degree $t$, is
at most $\lfloor (w - 1)/t \rfloor$.

The number of $\mathbf{c}$ of a given weight $w$ is $(q - 1)^w \binom{q^m}{w}$, so the total number of
$\Gamma(L, g)$ containing at least a word of weight $< d$ is
$\le \sum_{w=t+1}^{d-1} \lfloor (w - 1)/t \rfloor (q - 1)^w \binom{q^m}{w}$.
(Since $g(z)$ has degree $t$, Corollary 9.3.6 implies that $\Gamma(L, g)$ does not
have any nonzero words of weight $\le t$, so the sum begins with $w = t + 1$.)

The number of irreducible polynomials in $\mathbf{F}_{q^m}[z]$ of degree $t$ is given by
$I_{q^m}(t) = \frac{1}{t} \sum_{s \mid t} \mu(s)\, q^{mt/s}$ (cf. Exercise 3.28), where $\mu$ is the Möbius
function. For $2 \le s \le t$, clearly $\mu(s)\, q^{mt/s} \ge -q^{mt/2}$. Hence, with $d(t)$ denoting
the number of positive divisors of $t$, we have

\[ I_{q^m}(t) = \frac{1}{t} \sum_{s \mid t} \mu(s)\, q^{mt/s} \ge \frac{1}{t} \left( q^{mt} - (d(t) - 1)\, q^{mt/2} \right) \ge \frac{1}{t}\, q^{mt} \left( 1 - (t - 1)\, q^{-mt/2} \right). \]

Therefore, if (9.6) holds, then there is at least one irreducible polynomial
$g(z)$ in $\mathbf{F}_{q^m}[z]$ of degree $t$ such that $\Gamma(L, g)$ does not contain any nonzero word
of weight $< d$; i.e., the minimum distance of $\Gamma(L, g)$ is at least $d$. □
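
The count of irreducible polynomials used in the proof can be evaluated directly from the Möbius formula $\frac{1}{t} \sum_{s \mid t} \mu(s)\, Q^{t/s}$ for monic irreducible polynomials of degree $t$ over a field with $Q$ elements. The following is a minimal Python sketch, not part of the original text; the function names `mobius` and `count_irreducible` are assumptions, and `Q` stands for the field size $q^m$.

```python
def mobius(s):
    """Moebius function: 0 if s has a squared prime factor, else (-1)^(number of prime factors)."""
    result = 1
    d = 2
    while d * d <= s:
        if s % d == 0:
            s //= d
            if s % d == 0:
                return 0
            result = -result
        d += 1
    if s > 1:
        result = -result
    return result

def count_irreducible(Q, t):
    """Number of monic irreducible polynomials of degree t over a field with Q elements."""
    total = sum(mobius(s) * Q ** (t // s)
                for s in range(1, t + 1) if t % s == 0)
    return total // t

print(count_irreducible(2, 3))  # 2: x^3 + x + 1 and x^3 + x^2 + 1
print(count_irreducible(8, 2))  # 28 monic irreducible quadratics over F_8
```

For instance, the Goppa polynomial $z^2 + z + \alpha^3$ of Example 9.3.10(iii) is one of the 28 monic irreducible quadratics over $\mathbf{F}_8$.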


9.4 Sudan decoding for generalized RS codes 

For a linear code C, a list-decoding with error-bound r produces a list of all 
the codewords c e C that are within Hamming distance r from the received 
word. Consider the q - ary generalized Reed-Solomon code G RS k +i(ct, 1), 
where a = (a 1 , . . . , a„) with a, e F q , for 1 < i < n, and 1 = (1, . . . , 1); i.e., 

GRS k+ i(a, 1) 

= {(/(ai), /(a 2 ), . . . , /(a,,)) : /(*) e F q [x] and deg(/(x)) < k}. (9.7) 

Recall that GRSk+i(a, 1) is an [n, k + 1, n — £]-linear code over . 

In this section, we discuss an algorithm, due basically to M. Sudan, for
list-decoding $GRS_{k+1}(\boldsymbol{\alpha}, \mathbf{1})$. It is one of the most effective decoding schemes
currently available for such codes. Modifications of this algorithm are also
available for the decoding of some other codes discussed in this chapter, but we
restrict our discussion to this generalized RS code. For more details, the reader
may refer to refs. [6], [18] and [21].

For $GRS_{k+1}(\boldsymbol{\alpha}, \mathbf{1})$ $(0 < k < n)$ and a received word $(\beta_1, \beta_2, \ldots, \beta_n) \in \mathbb{F}_q^n$,
let $\mathcal{P} = \{(\alpha_i, \beta_i) : 1 \le i \le n\}$, and let $t$ be a positive integer $\le n$.

In general, a list-decoding with error-bound $\tau = n - t$ solves the following
polynomial reconstruction problem:

$(\mathcal{P}, k, t)$-reconstruction. For $\mathcal{P}, k, t$ as above, reconstruct the set, denoted by
$\Omega(\mathcal{P}, k, t)$, of all the polynomials $f(x) \in \mathbb{F}_q[x]$, with $\deg(f(x)) \le k$, which
satisfy

\[ |\{(\alpha, \beta) \in \mathcal{P} : f(\alpha) = \beta\}| \ge t. \tag{9.8} \]
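
To make the reconstruction problem concrete, the following is a minimal brute-force sketch, assuming a prime field $\mathbb{F}_p$ represented by integers modulo $p$. The function names and the toy instance are illustrative assumptions and are not part of the text. The search is exponential in $k$, which is exactly the cost the Sudan algorithm avoids.

```python
from itertools import product

def poly_eval(coeffs, x, p):
    """Evaluate sum(coeffs[i] * x**i) modulo the prime p (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def omega_bruteforce(points, k, t, p):
    """Return all coefficient tuples (f_0, ..., f_k) over F_p whose
    polynomial agrees with at least t of the given (alpha, beta) pairs,
    i.e. the set Omega(P, k, t) of condition (9.8)."""
    found = []
    for coeffs in product(range(p), repeat=k + 1):
        agreements = sum(1 for a, b in points if poly_eval(coeffs, a, p) == b)
        if agreements >= t:
            found.append(coeffs)
    return found

# Hypothetical toy instance over F_5: the codeword of f(x) = 1 + 2x at
# alpha = (0, 1, 2, 3, 4) is (1, 3, 0, 2, 4); the last position is corrupted.
P = [(0, 1), (1, 3), (2, 0), (3, 2), (4, 0)]
candidates = omega_bruteforce(P, k=1, t=4, p=5)
```

Here `candidates` contains only the coefficient tuple of $1 + 2x$, since two distinct polynomials of degree at most 1 agree in at most one point.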

The Sudan algorithm is a polynomial-time list-decoding algorithm for
$GRS_{k+1}(\boldsymbol{\alpha}, \mathbf{1})$ that solves the $(\mathcal{P}, k, t)$-reconstruction problem in two stages
as follows:

• Generation of the $(\mathcal{P}, k, t)$-polynomial. Generate a nonzero bivariate polynomial
$Q(x, y) \in \mathbb{F}_q[x, y]$, called the $(\mathcal{P}, k, t)$-polynomial, by solving a
linear system in polynomial time such that $y - f(x)$ divides $Q(x, y)$, for all
$f(x) \in \Omega(\mathcal{P}, k, t)$.

• Factorization of the $(\mathcal{P}, k, t)$-polynomial. Factorize the $(\mathcal{P}, k, t)$-polynomial
$Q(x, y)$ and then output $\Omega(\mathcal{P}, k, t)$, which is the set of polynomials $f(x) \in \mathbb{F}_q[x]$,
with $\deg(f(x)) \le k$, such that $y - f(x)$ divides $Q(x, y)$.


9.4.1 Generation of the $(\mathcal{P}, k, t)$-polynomial

We begin with some definitions. 

Definition 9.4.1 For a bivariate polynomial $Q(x, y) = \sum_{i,j} q_{i,j} x^i y^j \in \mathbb{F}_q[x, y]$,
its $x$-degree, denoted $\deg_x(Q)$, is defined as the largest integer $i$ with
$q_{i,j} \ne 0$, and its $y$-degree, denoted $\deg_y(Q)$, is defined as the largest integer $j$
with $q_{i,j} \ne 0$.


Example 9.4.2 Let $q = 2$ and

\[ Q(x, y) = (x + x^4) + (1 + x^4) y + (1 + x) y^2. \]

Then, $\deg_x(Q) = 4$ and $\deg_y(Q) = 2$.

Definition 9.4.3 For an integer $r > 0$, a pair $(\alpha, \beta) \in \mathbb{F}_q^2$ is called an $r$-singular
point of $Q(x, y) \in \mathbb{F}_q[x, y]$ if the coefficients of the polynomial
$Q(x + \alpha, y + \beta) = \sum_{i,j} q'_{i,j} x^i y^j$ satisfy $q'_{i,j} = 0$, for all $i, j$ with $i + j < r$.

Example 9.4.4 Let $q = 2$ and let $Q(x, y)$ be as in Example 9.4.2. Consider
the pair $(1, 1) \in \mathbb{F}_2^2$. It can be checked easily that

\[ Q(x + 1, y + 1) = x^4 y + x y^2, \]

so $(1, 1)$ is a 3-singular point of $Q(x, y)$.
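
The check in Example 9.4.4 can be reproduced mechanically. The sketch below assumes a prime field $\mathbb{F}_p$ with integers modulo $p$ and stores a bivariate polynomial as a dictionary mapping $(i, j)$ to the coefficient of $x^i y^j$. The helper names `shift` and `singularity` are illustrative choices, not notation from the text. The shift uses the binomial expansion of $(x + a)^{i'} (y + b)^{j'}$.

```python
from math import comb

def shift(Q, a, b, p):
    """Coefficients of Q(x + a, y + b) over F_p.
    Q maps (i, j) to the coefficient of x^i y^j."""
    out = {}
    for (i2, j2), c in Q.items():
        for i in range(i2 + 1):
            for j in range(j2 + 1):
                term = (c * comb(i2, i) * comb(j2, j)
                        * pow(a, i2 - i, p) * pow(b, j2 - j, p))
                out[(i, j)] = (out.get((i, j), 0) + term) % p
    # Drop monomials whose coefficient vanished modulo p.
    return {mon: c for mon, c in out.items() if c}

def singularity(Q, a, b, p):
    """Largest r such that (a, b) is an r-singular point of the nonzero Q."""
    return min(i + j for (i, j) in shift(Q, a, b, p))

# Q(x, y) = (x + x^4) + (1 + x^4) y + (1 + x) y^2 over F_2 (Example 9.4.2).
Q = {(1, 0): 1, (4, 0): 1, (0, 1): 1, (4, 1): 1, (0, 2): 1, (1, 2): 1}
shifted = shift(Q, 1, 1, 2)
```

With this input, `shifted` holds exactly the monomials $x^4 y$ and $x y^2$, and `singularity(Q, 1, 1, 2)` returns 3.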

Lemma 9.4.5 Assume that $(\alpha, \beta) \in \mathbb{F}_q^2$ is an $r$-singular point of $Q(x, y) \in \mathbb{F}_q[x, y]$.
Then, for any $f(x) \in \mathbb{F}_q[x]$ with $f(\alpha) = \beta$, $(x - \alpha)^r$ divides
$Q(x, f(x))$.

Proof. Since $(\alpha, \beta) \in \mathbb{F}_q^2$ is an $r$-singular point of $Q(x, y)$, $x^r$ divides
$Q(x + \alpha, xy + \beta)$, and thus $(x - \alpha)^r$ divides $Q(x, (x - \alpha) g(x) + \beta)$, for any
$g(x) \in \mathbb{F}_q[x]$. As $f(\alpha) = \beta$, we have that $x - \alpha$ divides $f(x) - \beta$, so
$f(x) = (x - \alpha) g(x) + \beta$, for some $g(x) \in \mathbb{F}_q[x]$. Hence, $(x - \alpha)^r$ divides
$Q(x, (x - \alpha) g(x) + \beta) = Q(x, f(x))$. □

Definition 9.4.6 A polynomial $f(x) \in \mathbb{F}_q[x]$ is called a $y$-root of $Q(x, y) \in \mathbb{F}_q[x, y]$
if $Q(x, f(x))$ is identically zero, i.e., $y - f(x)$ divides $Q(x, y)$.

Lemma 9.4.7 If all the pairs in $\mathcal{P}$ are $r$-singular points of $Q(x, y) \in \mathbb{F}_q[x, y]$,
which satisfies

\[ \deg_x(Q) + k \deg_y(Q) < rt, \tag{9.9} \]

then each polynomial in $\Omega(\mathcal{P}, k, t)$ is a $y$-root of $Q(x, y)$.

Proof. Assume that $f(x)$ belongs to $\Omega(\mathcal{P}, k, t)$. From $f(x) \in \mathbb{F}_q[x]$, with
$\deg(f(x)) \le k$, and $\deg_x(Q) + k \deg_y(Q) < rt$, we see that $Q(x, f(x))$ is a
polynomial of degree at most $rt - 1$. From Lemma 9.4.5, $(x - \alpha_i)^r$ divides
$Q(x, f(x))$ for at least $t$ distinct indices $i$, so $Q(x, f(x))$ is divisible by a
polynomial of degree at least $rt$. Hence, $Q(x, f(x))$ is identically
zero, i.e., $y - f(x)$ divides $Q(x, y)$. □


Lemma 9.4.8 If $m$, $\ell$ are nonnegative integers that satisfy $m < k$ and

\[ |\mathcal{P}| \binom{r + 1}{2} < \frac{(2m + k\ell + 2)(\ell + 1)}{2}, \tag{9.10} \]

where $|\mathcal{P}|$ is the cardinality of $\mathcal{P}$, then there exists at least one nonzero bivariate
polynomial $Q(x, y) \in \mathbb{F}_q[x, y]$, satisfying

\[ \deg_x(Q) + k \deg_y(Q) \le m + k\ell, \tag{9.11} \]

such that all the pairs in $\mathcal{P}$ are $r$-singular points of $Q(x, y)$.


Proof. By Exercise 9.12, the pairs in $\mathcal{P}$ are $r$-singular points of a bivariate
polynomial $Q(x, y) = \sum_{i,j} q_{i,j} x^i y^j \in \mathbb{F}_q[x, y]$ if and only if the constraint

\[ \sum_{i' \ge i} \sum_{j' \ge j} \binom{i'}{i} \binom{j'}{j} q_{i',j'} \alpha^{i'-i} \beta^{j'-j} = 0 \tag{9.12} \]

holds for all the pairs $(\alpha, \beta) \in \mathcal{P}$ and for all the nonnegative integers $i, j$ with
$i + j < r$. The number of constraints of the form (9.12) is equal to $|\mathcal{P}| \binom{r+1}{2}$.
From (9.11) and $m < k$, the number of unknowns in the constraints (9.12) is
equal to

\[ \sum_{j=0}^{\ell} \sum_{i=0}^{m + k(\ell - j)} 1 = \frac{(2m + k\ell + 2)(\ell + 1)}{2}. \tag{9.13} \]

Thus, from (9.10) and from the fact that the constraints (9.12) are linear in
the unknowns, we conclude that a nonzero bivariate polynomial $Q(x, y) = \sum_{i,j} q_{i,j} x^i y^j$,
satisfying the constraints (9.12), does exist. □

Definition 9.4.9 A sequence $(\ell, m, r)$ of nonnegative integers is called a
$(\mathcal{P}, k, t)$-sequence if $m < \min\{k, rt - \ell k\}$ and (9.10) holds.
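
The counting argument behind Lemma 9.4.8 and Definition 9.4.9 is easy to check numerically. The following sketch counts the unknowns by direct enumeration, counts the constraints of the form (9.12), and tests the two conditions of Definition 9.4.9. The function names and the sample parameters are illustrative assumptions.

```python
from math import comb

def num_unknowns(l, m, k):
    """Count monomials x^i y^j with 0 <= j <= l and 0 <= i <= m + k*(l - j)."""
    return sum(m + k * (l - j) + 1 for j in range(l + 1))

def num_constraints(n_points, r):
    """Each point contributes one equation per pair (i, j) with i + j < r."""
    return n_points * comb(r + 1, 2)

def is_sequence(l, m, r, n_points, k, t):
    """Check Definition 9.4.9: m < min{k, rt - lk} and inequality (9.10).
    Inequality (9.10) is multiplied by 2 to stay in integer arithmetic."""
    return (m < min(k, r * t - l * k)
            and 2 * num_constraints(n_points, r) < (2 * m + k * l + 2) * (l + 1))

# Hypothetical parameters: |P| = 7, k = 2, t = 5 and the sequence (l, m, r) = (4, 1, 2).
unknowns = num_unknowns(4, 1, 2)
constraints = num_constraints(7, 2)
```

For these parameters there are 30 unknowns against 21 homogeneous linear constraints, so a nonzero solution exists.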

Theorem 9.4.10 If $t > \sqrt{k|\mathcal{P}|}$, a $(\mathcal{P}, k, t)$-polynomial $Q(x, y) = \sum_{i,j} q_{i,j} x^i y^j \in \mathbb{F}_q[x, y]$
with $\deg_y(Q) = O(\sqrt{k|\mathcal{P}|^3})$ can be found in polynomial time by
solving a linear system whose constraints are of the form (9.12).

Proof. Let $(\ell, m, r)$ be the $(\mathcal{P}, k, t)$-sequence given in Exercise 9.13. From
Lemma 9.4.8, a nonzero bivariate polynomial $Q(x, y)$ satisfying (9.11) can be
found in polynomial time by solving a linear system with constraints of the
form (9.12).

Since $m + \ell k < rt$, it follows from (9.11) and Lemma 9.4.7 that all the
polynomials in $\Omega(\mathcal{P}, k, t)$ are $y$-roots of $Q(x, y)$. Hence, $Q(x, y)$ is a
$(\mathcal{P}, k, t)$-polynomial.

By the choice of $r$ in Exercise 9.13, we see that $r = O(k|\mathcal{P}|/(t^2 - k|\mathcal{P}|))$.
Therefore,

\[ \deg_y(Q) \le \ell = O(t|\mathcal{P}|/(t^2 - k|\mathcal{P}|)). \tag{9.14} \]

Let $t_0$ be the smallest integer such that $t_0^2 - k|\mathcal{P}| \ge 1$. Then $t_0 = O(\sqrt{k|\mathcal{P}|})$.
Now, $t/(t^2 - k|\mathcal{P}|)$ is monotone decreasing in $t$ for $t > \sqrt{k|\mathcal{P}|}$, so it follows
from (9.14) that

\[ \deg_y(Q) = O(|\mathcal{P}| \sqrt{k|\mathcal{P}|}) = O(\sqrt{k|\mathcal{P}|^3}). \]

This completes the proof of Theorem 9.4.10. □
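
As a numerical illustration of the proof, the sketch below computes a candidate sequence $(\ell, m, r)$. It assumes that the choice in Exercise 9.13 is $r = 1 + \lfloor (\gamma + \sqrt{\gamma^2 + 4(t^2 - \gamma)}) / (2(t^2 - \gamma)) \rfloor$ with $\gamma = k|\mathcal{P}|$, $\ell = \lfloor (rt - 1)/k \rfloor$ and $m = rt - 1 - k\ell$. This reading of the exercise, and the function name `sudan_sequence`, are assumptions. The floor of the square root is taken with `isqrt`, which does not change the value of the outer floor because the other terms are integers.

```python
from math import isqrt, comb

def sudan_sequence(n_points, k, t):
    """Candidate (l, m, r) for |P| = n_points, assuming t*t > k * n_points."""
    gamma = k * n_points
    delta = t * t - gamma
    if delta <= 0:
        raise ValueError("need t > sqrt(k * |P|)")
    # floor((gamma + sqrt(D)) / (2*delta)) equals the same floor with isqrt(D).
    r = 1 + (gamma + isqrt(gamma * gamma + 4 * delta)) // (2 * delta)
    l = (r * t - 1) // k
    m = r * t - 1 - k * l
    return l, m, r

def satisfies_definition(l, m, r, n_points, k, t):
    """Conditions of Definition 9.4.9, with (9.10) cleared of the factor 1/2."""
    return (m < min(k, r * t - l * k)
            and 2 * n_points * comb(r + 1, 2) < (2 * m + k * l + 2) * (l + 1))

# Hypothetical parameters: |P| = 7, k = 2, t = 5.
l, m, r = sudan_sequence(7, 2, 5)
```

For these parameters the sketch returns $(\ell, m, r) = (4, 1, 2)$, which satisfies both conditions of Definition 9.4.9.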

Remark 9.4.11 The $y$-degree of a $(\mathcal{P}, k, t)$-polynomial can serve as an upper
bound for the cardinality of $\Omega(\mathcal{P}, k, t)$.


9.4.2 Factorization of the $(\mathcal{P}, k, t)$-polynomial

To reconstruct the set $\Omega(\mathcal{P}, k, t)$, it is enough to find all the $y$-roots $f(x) \in \mathbb{F}_q[x]$,
with $\deg(f(x)) \le k$, of a $(\mathcal{P}, k, t)$-polynomial $Q(x, y)$. Since many
efficient algorithms for factorizing univariate polynomials over $\mathbb{F}_q$ are available
in the literature (see, for example, Chap. 3 of ref. [14]), we do not discuss here
the factorization of such polynomials.

Lemma 9.4.12 Assume that $f_0(x) = \sum_{i \ge 0} a_i x^i \in \mathbb{F}_q[x]$ is a $y$-root of a
nonzero bivariate polynomial $Q_0(x, y)$. Let $Q_0^*(x, y) = Q_0(x, y)/x^{\sigma_0}$ and
$Q_1(x, y) = Q_0^*(x, xy + a_0)$, where $\sigma_0$ is the largest integer such that $x^{\sigma_0}$ divides
$Q_0(x, y)$. Then, $a_0$ is a root of the nonzero univariate polynomial $Q_0^*(0, y)$, and
$f_1(x) = \sum_{i \ge 0} a_{i+1} x^i \in \mathbb{F}_q[x]$ is a $y$-root of the nonzero bivariate polynomial
$Q_1(x, y)$.

Proof. From the definition, we see easily that both $Q_0^*(0, y)$ and $Q_1(x, y)$
are nonzero polynomials. Since $f_0(x)$ is a $y$-root of $Q_0(x, y)$, it means that
$Q_0(x, f_0(x))$ is identically zero. Then, we have

\[ Q_0^*(x, f_0(x)) = Q_0(x, f_0(x))/x^{\sigma_0} = 0, \tag{9.15} \]

and thus $Q_0^*(0, a_0) = Q_0^*(0, f_0(0)) = 0$; i.e., $a_0$ is a root of $Q_0^*(0, y)$.

Since $x f_1(x) + a_0 = f_0(x)$, from (9.15) we also have that
$Q_1(x, f_1(x)) = Q_0^*(x, f_0(x)) = 0$; i.e., $f_1(x)$
is a $y$-root of $Q_1(x, y)$. □

Lemma 9.4.13 Assume that $Q_0^*(x, y) \in \mathbb{F}_q[x, y]$ is a nonzero bivariate
polynomial and that $\alpha \in \mathbb{F}_q$ is a root of multiplicity $h$ of $Q_0^*(0, y)$. Let
$Q_1(x, y) = Q_0^*(x, xy + \alpha)$ and let $Q_1^*(x, y) = Q_1(x, y)/x^{\sigma_1}$, where $\sigma_1$ is the
largest integer such that $x^{\sigma_1}$ divides $Q_1(x, y)$. Then the degree of the univariate
polynomial $Q_1^*(0, y)$ is at most $h$.

Proof. We assume that

\[ G(x, y) := Q_0^*(x, y + \alpha) = \sum_{i \ge 0} g_i(x) y^i. \tag{9.16} \]

Since $\alpha$ is a root of $Q_0^*(0, y)$ of multiplicity $h$, $0$ is a root of multiplicity $h$ of
$G(0, y)$. Thus $g_i(0) = 0$, for $i = 0, 1, \ldots, h - 1$, and $g_h(0) \ne 0$. Then, from
$Q_1(x, y) = G(x, xy)$, we know that $x$ divides $G(x, xy)$ but $x^{h+1}$ does not.
Hence, $1 \le \sigma_1 \le h$. It follows from $Q_1^*(x, y) = Q_1(x, y)/x^{\sigma_1} = G(x, xy)/x^{\sigma_1}$
and (9.16) that

\[ Q_1^*(x, y) = \sum_{i=0}^{\sigma_1} \frac{g_i(x)}{x^{\sigma_1 - i}} y^i + \sum_{i \ge \sigma_1 + 1} g_i(x) x^{i - \sigma_1} y^i. \tag{9.17} \]

Hence, $Q_1^*(0, y)$ is a univariate polynomial of degree at most $\sigma_1 \le h$. □

For a nonzero bivariate polynomial $Q_0(x, y) \in \mathbb{F}_q[x, y]$ and a positive integer $j$,
let $S_j(Q_0)$ denote the set of sequences $(a_0, a_1, \ldots, a_{j-1}) \in \mathbb{F}_q^j$ such that
$a_i$ is a root of $Q_i^*(0, y)$, for $i = 0, 1, \ldots, j - 1$, where $Q_i^*(x, y) = Q_i(x, y)/x^{\sigma_i}$
with $x^{\sigma_i}$ exactly dividing $Q_i(x, y)$, and $Q_{i+1}(x, y) = Q_i^*(x, xy + a_i)$. Applying
Lemmas 9.4.12 and 9.4.13, we obtain the following theorem.

Theorem 9.4.14 For any nonzero bivariate polynomial $Q(x, y) \in \mathbb{F}_q[x, y]$
and any positive integer $j$, the cardinality of $S_j(Q)$ is at most $\deg_y(Q)$,
and, for each $y$-root $f(x) = \sum_{i \ge 0} a_i x^i \in \mathbb{F}_q[x]$ of $Q(x, y)$, the sequence
$(a_0, a_1, \ldots, a_{j-1})$ belongs to $S_j(Q)$.

Example 9.4.15 Let $\delta$ be a primitive element in $\mathbb{F}_8$ satisfying $\delta^3 + \delta + 1 = 0$.
(Unlike in the earlier parts of this chapter, we do not use $\alpha$ to denote a primitive
root here as $\alpha$ has already been used for other purposes in this section.) Find
all the $y$-roots $f(x) \in \mathbb{F}_8[x]$, with $\deg(f(x)) \le 3$, of the following bivariate
polynomial:

\[ Q(x, y) = (\delta^5 x + \delta^2 x^3 + \delta^6 x^4 + \delta^2 x^5 + \delta^5 x^6 + \delta^2 x^7 + x^8) + (\delta^4 + \delta^3 x^2 + \delta^5 x^4) y + (\delta x + \delta^4 x^3 + \delta^2 x^4) y^2 + y^3. \]

Solution. We have $Q_0^*(x, y) = Q(x, y)$ and $Q_0^*(0, y) = \delta^4 y + y^3$, which has
two roots $0$ and $\delta^2$. The multiplicity of the latter root is equal to 2.

Case 1: For the root $0$ of $Q_0^*(0, y)$, we have

\[ Q_1^*(x, y) = Q_0^*(x, xy)/x = (\delta^5 + \delta^2 x^2 + \delta^6 x^3 + \delta^2 x^4 + \delta^5 x^5 + \delta^2 x^6 + x^7) + (\delta^4 + \delta^3 x^2 + \delta^5 x^4) y + (\delta x^2 + \delta^4 x^4 + \delta^2 x^5) y^2 + x^2 y^3 \]

and $Q_1^*(0, y) = \delta^5 + \delta^4 y$, which has the unique root $\delta$. Then

\[ Q_2^*(x, y) = Q_1^*(x, xy + \delta)/x = (\delta x + \delta^6 x^2 + \delta^2 x^3 + x^4 + \delta^2 x^5 + x^6) + (\delta^4 + \delta^5 x^2 + \delta^5 x^4) y + (\delta^4 x^5 + \delta^2 x^6) y^2 + x^4 y^3 \]

and $Q_2^*(0, y) = \delta^4 y$, which has the unique root $0$. Hence,

\[ Q_3^*(x, y) = Q_2^*(x, xy)/x = (\delta + \delta^6 x + \delta^2 x^2 + x^3 + \delta^2 x^4 + x^5) + (\delta^4 + \delta^5 x^2 + \delta^5 x^4) y + (\delta^4 x^6 + \delta^2 x^7) y^2 + x^6 y^3 \]

and $Q_3^*(0, y) = \delta + \delta^4 y$, whose unique root is $\delta^4$. Thus, we have
$(0, \delta, 0, \delta^4) \in S_3(Q)$.

Case 2: For the root $\delta^2$ of $Q_0^*(0, y)$, we have

\[ Q_1^*(x, y) = Q_0^*(x, xy + \delta^2)/x^2 = (\delta^5 + \delta^4 x + x^2 + \delta^2 x^3 + \delta^5 x^4 + \delta^2 x^5 + x^6) + (\delta^3 x + \delta^5 x^3) y + (\delta^2 + \delta x + \delta^4 x^3 + \delta^2 x^4) y^2 + x y^3 \]

and $Q_1^*(0, y) = \delta^5 + \delta^2 y^2$, which has a root $\delta^5$ (of multiplicity 2). Then

\[ Q_2^*(x, y) = Q_1^*(x, xy + \delta^5)/x^2 = (1 + \delta^4 x + \delta^2 x^3 + x^4) + \delta^5 x^2 y + (\delta^2 + \delta^6 x + \delta^4 x^3 + \delta^2 x^4) y^2 + x^2 y^3 \]

and $Q_2^*(0, y) = 1 + \delta^2 y^2$, which has a root $\delta^6$ (of multiplicity 2). Hence,

\[ Q_3^*(x, y) = Q_2^*(x, xy + \delta^6)/x^2 = (\delta^2 + \delta^6 x + \delta^6 x^2 + \delta^4 x^3 + \delta^2 x^4) y^2 + x^3 y^3 \]

and $Q_3^*(0, y) = \delta^2 y^2$, which has a root $0$ (of multiplicity 2). Thus, we have
$(\delta^2, \delta^5, \delta^6, 0) \in S_3(Q)$.

We have just shown that $S_3(Q) = \{(0, \delta, 0, \delta^4), (\delta^2, \delta^5, \delta^6, 0)\}$. The polynomials
related to the sequences in $S_3(Q)$ are

\[ f(x) = \delta x + \delta^4 x^3 \quad \text{and} \quad g(x) = \delta^2 + \delta^5 x + \delta^6 x^2. \]

Since $y - g(x)$ divides $Q(x, y)$ but $y - f(x)$ does not, the bivariate polynomial
$Q(x, y)$ has a unique $y$-root $g(x) = \delta^2 + \delta^5 x + \delta^6 x^2$ of degree $\le 3$ in $\mathbb{F}_8[x]$.

Indeed, we can also show that $S_4(Q) = \{(0, \delta, 0, \delta^4, \delta^2), (\delta^2, \delta^5, \delta^6, 0, 0)\}$
and then find that $h(x) = \delta x + \delta^4 x^3 + \delta^2 x^4 \in \mathbb{F}_8[x]$, with $\deg(h(x)) \le 4$, is also
a $y$-root of $Q(x, y)$. Furthermore, from $Q(x, y)/((y - g(x))(y - h(x))) = y - g(x)$,
we see that $Q(x, y)$ can be factorized as $Q(x, y) = (y - g(x))^2 (y - h(x))$.
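
The factorization claimed at the end of Example 9.4.15 can be confirmed by direct multiplication in $\mathbb{F}_8$. The sketch below is one possible encoding and is not taken from the text: field elements are 3-bit integers, $\delta$ is encoded as `0b010`, addition is XOR, and the helper names `gf8_mul`, `delta_pow` and `bimul` are illustrative.

```python
def gf8_mul(a, b):
    """Multiply in F_8 = F_2[d]/(d^3 + d + 1); elements are 3-bit integers."""
    prod = 0
    for _ in range(3):
        if b & 1:
            prod ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= 0b1011  # reduce using d^3 = d + 1
    return prod

def delta_pow(e):
    """Return delta**e, with delta encoded as the integer 2."""
    out = 1
    for _ in range(e % 7):
        out = gf8_mul(out, 2)
    return out

def bimul(A, B):
    """Multiply bivariate polynomials stored as {(i, j): coeff} over F_8."""
    out = {}
    for (i1, j1), a in A.items():
        for (i2, j2), b in B.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) ^ gf8_mul(a, b)
    return {mon: c for mon, c in out.items() if c}

d = delta_pow
# y - g(x) and y - h(x); signs are irrelevant in characteristic 2.
y_minus_g = {(0, 1): 1, (0, 0): d(2), (1, 0): d(5), (2, 0): d(6)}
y_minus_h = {(0, 1): 1, (1, 0): d(1), (3, 0): d(4), (4, 0): d(2)}

# Q(x, y) as given in Example 9.4.15.
Q = {(1, 0): d(5), (3, 0): d(2), (4, 0): d(6), (5, 0): d(2), (6, 0): d(5),
     (7, 0): d(2), (8, 0): 1,
     (0, 1): d(4), (2, 1): d(3), (4, 1): d(5),
     (1, 2): d(1), (3, 2): d(4), (4, 2): d(2),
     (0, 3): 1}

product = bimul(bimul(y_minus_g, y_minus_g), y_minus_h)
```

The dictionary `product` coincides with `Q`, in agreement with $Q(x, y) = (y - g(x))^2 (y - h(x))$.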

According to Theorem 9.4.14, the following recursive factoring algorithm
computes all the $y$-roots $f(x) \in \mathbb{F}_q[x]$ of degree $\le k$ of a bivariate polynomial
$Q(x, y)$ with the help of any factoring algorithm of a univariate polynomial
over $\mathbb{F}_q$.


Factoring algorithm

Input: A nonzero bivariate polynomial $Q(x, y) \in \mathbb{F}_q[x, y]$ and a
positive integer $k$.

Output: The set $\Omega$ of $y$-roots $f(x) \in \mathbb{F}_q[x]$, of degree $\le k$, of
$Q(x, y)$.

Step 1: Define $Q_\eta^*(x, y) := Q(x, y)/x^{\sigma}$, where $\eta$ denotes a
sequence of length 0 and $\sigma$ is the number such that $x^{\sigma}$
exactly divides $Q(x, y)$. Set $j \leftarrow 2$, $S$ as the set of the roots
of $Q_\eta^*(0, y)$, $S' \leftarrow \emptyset$ and goto Step 2.

Step 2: For each $s = (s', \alpha) \in S$, do

(i) define $Q_s^*(x, y) := Q_{s'}^*(x, xy + \alpha)/x^{\sigma}$, where $\sigma$ is the number
such that $x^{\sigma}$ exactly divides $Q_{s'}^*(x, xy + \alpha)$;

(ii) factorize $Q_s^*(0, y)$ and, for each root $\beta$, add $(s, \beta)$
into $S'$.

Goto Step 3.

Step 3: If $j = k + 1$, goto Step 4.

Else, set $j \leftarrow j + 1$, $S \leftarrow S'$, $S' \leftarrow \emptyset$ and goto Step 2.

Step 4: Output the set $\Omega$ of polynomials $f(x) = \sum_{i=0}^{k} a_i x^i \in \mathbb{F}_q[x]$,
with degree $\le k$, for which $(a_0, a_1, \ldots, a_k) \in S'$
and $y - f(x)$ divides $Q(x, y)$.

END.
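
The four steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the text: the field is a prime field $\mathbb{F}_p$ with integers modulo $p$, the roots of the univariate polynomials are found by exhaustive search instead of a factoring algorithm, and the divisibility test in Step 4 checks that $Q(x, f(x))$ vanishes. All function names are hypothetical.

```python
from math import comb

def strip_x(Q):
    """Divide Q by the largest power of x that divides it."""
    sigma = min(i for (i, _) in Q)
    return {(i - sigma, j): c for (i, j), c in Q.items()}

def substitute(Q, a, p):
    """Return Q(x, x*y + a) over F_p, dropping zero coefficients."""
    out = {}
    for (i, j), c in Q.items():
        for s in range(j + 1):
            term = c * comb(j, s) * pow(a, j - s, p)
            key = (i + s, s)
            out[key] = (out.get(key, 0) + term) % p
    return {mon: c for mon, c in out.items() if c}

def roots_at_x0(Q, p):
    """Roots in F_p of the univariate polynomial Q(0, y), by exhaustive search."""
    return [b for b in range(p)
            if sum(c * pow(b, j, p) for (i, j), c in Q.items() if i == 0) % p == 0]

def is_y_root(Q, f, p):
    """Check that Q(x, f(x)) is identically zero; f is a coefficient tuple."""
    total = {}
    for (i, j), c in Q.items():
        power = {0: 1}  # f(x)**j, built by repeated multiplication
        for _ in range(j):
            nxt = {}
            for e1, c1 in power.items():
                for e2, c2 in enumerate(f):
                    nxt[e1 + e2] = (nxt.get(e1 + e2, 0) + c1 * c2) % p
            power = nxt
        for e, ce in power.items():
            total[i + e] = (total.get(i + e, 0) + c * ce) % p
    return all(v == 0 for v in total.values())

def y_roots(Q, k, p):
    """All f of degree <= k over F_p with y - f(x) dividing the nonzero Q."""
    Q = {mon: c % p for mon, c in Q.items() if c % p}
    star = strip_x(Q)
    frontier = [((a,), star) for a in roots_at_x0(star, p)]   # Step 1
    for _ in range(k):                                         # Steps 2 and 3
        nxt = []
        for seq, Qs in frontier:
            Qn = strip_x(substitute(Qs, seq[-1], p))
            nxt.extend((seq + (b,), Qn) for b in roots_at_x0(Qn, p))
        frontier = nxt
    return sorted({seq for seq, _ in frontier if is_y_root(Q, seq, p)})  # Step 4

# Hypothetical test polynomial over F_5: Q = (y - (1 + 2x)) * (y - (3 + x^2)).
Q5 = {(0, 2): 1, (0, 1): 1, (1, 1): 3, (2, 1): 4,
      (0, 0): 3, (1, 0): 1, (2, 0): 1, (3, 0): 2}
roots = y_roots(Q5, 2, 5)
```

For this input, `roots` lists the coefficient tuples of $1 + 2x$ and $3 + x^2$. With $k = 1$ only $1 + 2x$ survives Step 4, which mirrors the role of the final divisibility check in Example 9.4.15.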

Remark 9.4.16 The above factoring algorithm can be speeded up to some 
extent by using the result in Exercise 9.16. 


Exercises 

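9.1 Show that $GRS_k(\boldsymbol{\alpha}, \mathbf{v}) = GRS_k(\boldsymbol{\alpha}, \mathbf{w})$ if and only if $\mathbf{v} = \lambda \mathbf{w}$ for some
$\lambda \in \mathbb{F}_q^*$.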

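9.2 Let

\[ G = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \\ v_1 \alpha_1 & v_2 \alpha_2 & \cdots & v_n \alpha_n \\ \vdots & \vdots & & \vdots \\ v_1 \alpha_1^{k-1} & v_2 \alpha_2^{k-1} & \cdots & v_n \alpha_n^{k-1} \end{pmatrix} \]

be a generator matrix for the generalized RS code $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$ and let $C$
be the code with generator matrix $(G \mid \mathbf{u}^T)$, where $\mathbf{u} = (0, \ldots, 0, u)$, for
some $u \in \mathbb{F}_q^*$. Let $\mathbf{v}' = (v'_1, \ldots, v'_n)$ be such that $GRS_{n-k}(\boldsymbol{\alpha}, \mathbf{v}')$ is the
dual of $GRS_k(\boldsymbol{\alpha}, \mathbf{v})$.

(i) Show that there is some $w \in \mathbb{F}_q^*$ such that $\sum_{i=1}^{n} v_i v'_i \alpha_i^{n-1} + uw = 0$.

(ii) Show that

\[ H' = \begin{pmatrix} v'_1 & v'_2 & \cdots & v'_n & 0 \\ v'_1 \alpha_1 & v'_2 \alpha_2 & \cdots & v'_n \alpha_n & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ v'_1 \alpha_1^{n-k} & v'_2 \alpha_2^{n-k} & \cdots & v'_n \alpha_n^{n-k} & w \end{pmatrix} \]

is a parity-check matrix for $C$.

(iii) Show that any $n - k + 1$ columns of $H'$ are linearly independent.

(iv) Prove that $C$ is an MDS code.

9.3 Let $(a_{ij})_{1 \le i, j \le t}$ be an invertible matrix, where $a_{ij} \in \mathbb{F}_{q^m}$, for all $1 \le i, j \le t$. For
$1 \le i \le t$, let

\[ f_i(x) = a_{i1} + a_{i2} x + a_{i3} x^2 + \cdots + a_{it} x^{t-1}. \]

Show that, for $\mathbf{c} \in \mathbb{F}_q^n$, $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_n)$ and $\mathbf{v}' = (v'_1, \ldots, v'_n)$, we have
$\mathbf{c} \in A_{n-t}(\boldsymbol{\alpha}, \mathbf{v})$ if and only if $\mathbf{c} H^T = \mathbf{0}$, where

\[ H = \begin{pmatrix} v'_1 f_1(\alpha_1) & v'_2 f_1(\alpha_2) & \cdots & v'_n f_1(\alpha_n) \\ v'_1 f_2(\alpha_1) & v'_2 f_2(\alpha_2) & \cdots & v'_n f_2(\alpha_n) \\ \vdots & \vdots & & \vdots \\ v'_1 f_t(\alpha_1) & v'_2 f_t(\alpha_2) & \cdots & v'_n f_t(\alpha_n) \end{pmatrix}. \]

(Note: this is the reason for the name 'alternant code', as a matrix or
determinant of the form

\[ \begin{pmatrix} f_1(\alpha_1) & f_2(\alpha_1) & \cdots & f_t(\alpha_1) \\ f_1(\alpha_2) & f_2(\alpha_2) & \cdots & f_t(\alpha_2) \\ \vdots & \vdots & & \vdots \\ f_1(\alpha_n) & f_2(\alpha_n) & \cdots & f_t(\alpha_n) \end{pmatrix} \]

is called an alternant.)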

9.4 Let $\gcd(n, q) = 1$ and let $\mathbb{F}_{q^m}$ be the smallest extension of $\mathbb{F}_q$ containing
all the $n$th roots of 1. Let $\alpha$ be a primitive $n$th root of 1 in $\mathbb{F}_{q^m}$, so
$\{1, \alpha, \ldots, \alpha^{n-1}\} \subseteq \mathbb{F}_{q^m}$ are all the $n$th roots of 1. For $c(x) = \sum_{i=0}^{n-1} c_i x^i \in \mathbb{F}_q[x]$,
let $\hat{c}(z) \in \mathbb{F}_{q^m}[z]$ be defined by

\[ \hat{c}(z) = \sum_{j=1}^{n} \hat{c}_j z^{n-j}, \quad \text{where } \hat{c}_j = c(\alpha^j) = \sum_{i=0}^{n-1} c_i \alpha^{ij}. \]

(Note: the polynomial $\hat{c}(z)$ is called the Mattson-Solomon polynomial or
the discrete Fourier transform of $c(x)$.)

(i) Show that $c(x) = \frac{1}{n} \sum_{i=0}^{n-1} \hat{c}(\alpha^i) x^i$.

(ii) For a polynomial $f(x)$, recall that $(f(x) \pmod{x^n - 1})$ denotes
the remainder when $f(x)$ is divided by $x^n - 1$. For polynomials
$f(x) = \sum_{i=0}^{n-1} f_i x^i$ and $g(x) = \sum_{i=0}^{n-1} g_i x^i$, let

\[ f(x) * g(x) = \sum_{i=0}^{n-1} f_i g_i x^i. \]

(a) Show that $\widehat{(f + g)}(z) = \hat{f}(z) + \hat{g}(z)$.

(b) Show that $h(x) = (f(x) g(x) \pmod{x^n - 1})$ if and only if $\hat{h}(z) = \hat{f}(z) * \hat{g}(z)$.

(c) Show that $\hat{h}(z) = \frac{1}{n} (\hat{f}(z) \hat{g}(z) \pmod{z^n - 1})$ if and only if
$h(x) = f(x) * g(x)$.

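9.5 Let the notation be as in Exercise 9.4. Let $\hat{f}(z), \hat{g}(z) \in \mathbb{F}_{q^m}[z]$ be
polynomials relatively prime to $z^n - 1$ with $\deg(\hat{f}(z)) \le n - 1$ and
$t = \deg(\hat{g}(z)) \le n - 1$. Let $GBCH(\hat{f}, \hat{g})$ be defined as

\[ GBCH(\hat{f}, \hat{g}) = \{(c_0, \ldots, c_{n-1}) \in \mathbb{F}_q^n : (\hat{c}(z) \hat{f}(z) \pmod{z^n - 1}) \equiv 0 \pmod{\hat{g}(z)}\}, \]

where $c(x) = \sum_{i=0}^{n-1} c_i x^i$. Let $f(x), g(x) \in \mathbb{F}_{q^m}[x]$ be such that $\hat{f}(z), \hat{g}(z)$
are their respective Mattson-Solomon polynomials.

(i) Show that, if $f(x) = \sum_{i=0}^{n-1} f_i x^i$ and $g(x) = \sum_{i=0}^{n-1} g_i x^i$, then $f_i \ne 0$
and $g_i \ne 0$, for all $0 \le i \le n - 1$.

(ii) Show that the following conditions are equivalent:

(a) $\mathbf{c} = (c_0, \ldots, c_{n-1}) \in GBCH(\hat{f}, \hat{g})$;

(b) there is a polynomial $\hat{u}(z)$ with $\deg(\hat{u}(z)) \le n - t - 1$ such that
$(\hat{c}(z) \hat{f}(z) \pmod{z^n - 1}) = \hat{u}(z) \hat{g}(z)$;

(c) there is a polynomial $u(x) \in \mathbb{F}_{q^m}[x]$ such that $c(x) * f(x) = u(x) * g(x)$
and $\hat{u}_j = 0$ for $1 \le j \le t$, where $\hat{u}(z) = \sum_{j=1}^{n} \hat{u}_j z^{n-j}$
is the Mattson-Solomon polynomial of $u(x) = \sum_{i=0}^{n-1} u_i x^i$;

(d) there exist $u_0, \ldots, u_{n-1} \in \mathbb{F}_{q^m}$ such that $c_i f_i = u_i g_i$, for
$0 \le i \le n - 1$, and $\hat{u}_j = 0$, for $1 \le j \le t$;

(e) $\hat{u}_j = \sum_{i=0}^{n-1} c_i f_i \alpha^{ij} / g_i = 0$, for all $1 \le j \le t$.

(iii) Show that $\mathbf{c} \in GBCH(\hat{f}, \hat{g})$ if and only if $\mathbf{c} H^T = \mathbf{0}$, where $H$ is
equal to

\[ H = \begin{pmatrix} f_0/g_0 & f_1 \alpha / g_1 & \cdots & f_{n-1} \alpha^{n-1} / g_{n-1} \\ f_0/g_0 & f_1 \alpha^2 / g_1 & \cdots & f_{n-1} \alpha^{2(n-1)} / g_{n-1} \\ \vdots & \vdots & & \vdots \\ f_0/g_0 & f_1 \alpha^t / g_1 & \cdots & f_{n-1} \alpha^{t(n-1)} / g_{n-1} \end{pmatrix}. \]

(Note: therefore, $GBCH(\hat{f}, \hat{g})$ is an alternant code. It is called a
Chien-Choy generalized BCH code.)

9.6 Let $n$ be odd and let $\mathbb{F}_{2^m}$ be an extension of $\mathbb{F}_2$ containing all the $n$th roots
of 1. Let $\alpha$ be a primitive $n$th root of 1 in $\mathbb{F}_{2^m}$ and let $L = \{1, \alpha, \ldots, \alpha^{n-1}\}$.
For $\mathbf{c} = (c_0, \ldots, c_{n-1}) \in \mathbb{F}_2^n$, let $R_{\mathbf{c}}(z) = \sum_{i=0}^{n-1} c_i/(z + \alpha^i)$, as in
Definition 9.3.1. Let $c(x) = \sum_{i=0}^{n-1} c_i x^i$ and let $\hat{c}(z)$ be its Mattson-Solomon
polynomial.

(i) Show that $\hat{c}(z) = (z(z^n + 1) R_{\mathbf{c}}(z) \pmod{z^n - 1})$ and

\[ R_{\mathbf{c}}(z) = \sum_{i=0}^{n-1} \frac{\hat{c}(\alpha^i)}{z + \alpha^i}. \]

(ii) Show that the Goppa code $\Gamma(L, g)$ is equal to

\[ \Gamma(L, g) = \{\mathbf{c} \in \mathbb{F}_2^n : (z^{n-1} \hat{c}(z) \pmod{z^n - 1}) \equiv 0 \pmod{g(z)}\}. \]

(Hint: For (i), show that $z(z^n + 1) R_{\mathbf{c}}(z) = \sum_{i=0}^{n-1} c_i z \prod_{j \ne i} (z + \alpha^j)$. Then
show that $(z \prod_{j \ne i} (z + \alpha^j) \pmod{z^n - 1}) = \sum_{j=0}^{n-1} \alpha^{-ij} z^j$ by multiplying
both sides by $z + \alpha^i$. For (ii), show that $\mathbf{c} \in \Gamma(L, g)$ if and only if
$\sum_{i=0}^{n-1} c_i \prod_{j \ne i} (z + \alpha^j) \equiv 0 \pmod{g(z)}$, and then use (i).)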

9.7 Let the notation be as in Exercise 9.6 and suppose that $\Gamma(L, g)$ is a cyclic
code. Show that $g(z) = z^t$ for some $t$ and, when $n = 2^m - 1$, that $\Gamma(L, g)$
is a BCH code.

9.8 Let $\alpha_1, \ldots, \alpha_n, w_1, \ldots, w_t$ be distinct elements of $\mathbb{F}_{q^m}$ and let $z_1, \ldots, z_n$
be nonzero elements of $\mathbb{F}_{q^m}$. Let $C = \{\mathbf{c} \in \mathbb{F}_q^n : \mathbf{c} H^T = \mathbf{0}\}$, where

\[ H = \begin{pmatrix} z_1/(\alpha_1 - w_1) & z_2/(\alpha_2 - w_1) & \cdots & z_n/(\alpha_n - w_1) \\ z_1/(\alpha_1 - w_2) & z_2/(\alpha_2 - w_2) & \cdots & z_n/(\alpha_n - w_2) \\ \vdots & \vdots & & \vdots \\ z_1/(\alpha_1 - w_t) & z_2/(\alpha_2 - w_t) & \cdots & z_n/(\alpha_n - w_t) \end{pmatrix}. \]

Show that $C$ is equivalent to a Goppa code. (Note: this code is called a
Srivastava code.)

9.9 When $m = 1$ in Exercise 9.8, show that the Srivastava code $C$ is MDS.

9.10 Let $C$ be the binary cyclic code of length 15 with $x^2 + x + 1$ as the
generator polynomial. Show that $C$ is a BCH code but not a Goppa code.

9.11 Let $L = \mathbb{F}_8$ and let $g(z) = 1 + z + z^2$. Find the extended binary Goppa
code $\Gamma(L, g)$ and show that it is cyclic.

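9.12 Assume that $Q(x, y) = \sum_{i,j} q_{i,j} x^i y^j \in \mathbb{F}_q[x, y]$ and $(\alpha, \beta) \in \mathbb{F}_q^2$. Prove
that the coefficients of $Q(x + \alpha, y + \beta) = \sum_{i,j} q'_{i,j} x^i y^j \in \mathbb{F}_q[x, y]$ satisfy

\[ q'_{i,j} = \sum_{i' \ge i} \sum_{j' \ge j} \binom{i'}{i} \binom{j'}{j} q_{i',j'} \alpha^{i'-i} \beta^{j'-j}, \]

for all nonnegative integers $i$ and $j$.

9.13 For $\mathcal{P} \subseteq \mathbb{F}_q^2$, let $\gamma = k|\mathcal{P}|$. Assume that $t > \sqrt{\gamma}$. Prove that $(\ell, m, r)$ is
a $(\mathcal{P}, k, t)$-sequence, where

\[ r = 1 + \left\lfloor \frac{\gamma + \sqrt{\gamma^2 + 4(t^2 - \gamma)}}{2(t^2 - \gamma)} \right\rfloor, \quad \ell = \left\lfloor \frac{rt - 1}{k} \right\rfloor, \quad m = rt - 1 - k \left\lfloor \frac{rt - 1}{k} \right\rfloor. \]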

9.14 For every integer $k$ such that $0 < k < n$, find (or design an algorithm
to compute) the smallest positive number $T(n, k)$ such that, for any $t$
with $T(n, k) \le t \le n$ and $\mathcal{P} \subseteq \mathbb{F}_q^2$ with $|\mathcal{P}| = n$, there is at least one
$(\mathcal{P}, k, t)$-sequence.

9.15 For integers $n, k, t$ satisfying $0 < k < n$ and $T(n, k) \le t \le n$, find
(or design an algorithm to compute) the smallest positive number
$\ell = L(n, k, t)$ such that, for any set $\mathcal{P} \subseteq \mathbb{F}_q^2$ with $|\mathcal{P}| = n$, there is at least one
$(\mathcal{P}, k, t)$-sequence of the form $(\ell, m, r)$.

9.16 Let $Q_0(x, y) \in \mathbb{F}_q[x, y]$ be a nonzero bivariate polynomial. Assume that
$(a_0, a_1, \ldots, a_j) \in S_{j+1}(Q_0)$, $Q_i^*(x, y) = Q_i(x, y)/x^{\sigma_i}$, where $x^{\sigma_i}$ exactly
divides $Q_i(x, y)$, and $Q_{i+1}(x, y) = Q_i^*(x, xy + a_i)$, for $i = 0, 1, \ldots, j$.
Prove that

(i) $\sum_{i=0}^{j} a_i x^i$ is a $y$-root of $Q_0(x, y)$ if and only if $Q_{j+1}(x, 0) = 0$;

(ii) if $a_j = 0$ and there is a positive number $h$ such that

\[ Q_j^*(x, y) = \sum_{i \ge h} g_i(x) y^i \quad \text{and} \quad g_h(0) \ne 0, \]

then, for any $j' > j$ and sequence $(b_0, b_1, \ldots, b_{j'-1}) \in S_{j'}(Q_0)$ with
$b_i = a_i$, for all $0 \le i \le j$, the equality $b_l = 0$ must hold for all $l$
such that $j < l < j'$.